Bayesian GLM Part 4

Author

Murray Logan

Published

30/07/2023

1 Preparations

Load the necessary libraries

library(tidyverse)  #for data wrangling etc
library(rstanarm)   #for fitting models in STAN
library(cmdstanr)   #for cmdstan
library(brms)       #for fitting models in STAN
library(car)
library(tidybayes)  #for more tidying outputs
library(standist)   #for exploring distributions
library(coda)       #for diagnostics
library(bayesplot)  #for diagnostics
library(ggmcmc)     #for MCMC diagnostics
library(rstan)      #for interfacing with STAN
library(HDInterval) #for HPD intervals
library(emmeans)    #for marginal means etc
library(DHARMa)     #for residual diagnostics
library(broom)      #for tidying outputs
library(ggeffects)  #for partial plots
library(broom.mixed)#for summarising models
library(patchwork)  #for multiple plots
library(ggridges)   #for ridge plots 
library(bayestestR) #for ROPE
library(see)        #for some plots
library(easystats)  #for the easystats ecosystem
source("helperFunctions.R")

2 Scenario

Loyn (1987) modelled the abundance of forest birds with six predictor variables (patch area, distance to nearest patch, distance to nearest larger patch, grazing intensity, altitude and years since the patch had been isolated).

Figure 1: Regent honeyeater
Table 1: Format of loyn.csv data file
ABUND DIST LDIST AREA GRAZE ALT YR.ISOL
.. .. .. .. .. .. ..
Table 2: Description of the variables in the loyn data file
ABUND Abundance of forest birds in patch - response variable
DIST Distance to nearest patch - predictor variable
LDIST Distance to nearest larger patch - predictor variable
AREA Size of the patch - predictor variable
GRAZE Grazing intensity (1 to 5, representing light to heavy) - predictor variable
ALT Altitude - predictor variable
YR.ISOL Number of years since the patch was isolated - predictor variable

The aim of the analysis is to investigate the effects of a range of predictors on the abundance of forest birds.

3 Read in the data

loyn <- read_csv('../public/data/loyn.csv', trim_ws=TRUE)
Rows: 56 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (7): ABUND, AREA, YR.ISOL, DIST, LDIST, GRAZE, ALT

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
loyn |> glimpse()
Rows: 56
Columns: 7
$ ABUND   <dbl> 5.3, 2.0, 1.5, 17.1, 13.8, 14.1, 3.8, 2.2, 3.3, 3.0, 27.6, 1.8…
$ AREA    <dbl> 0.1, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.…
$ YR.ISOL <dbl> 1968, 1920, 1900, 1966, 1918, 1965, 1955, 1920, 1965, 1900, 19…
$ DIST    <dbl> 39, 234, 104, 66, 246, 234, 467, 284, 156, 311, 66, 93, 39, 40…
$ LDIST   <dbl> 39, 234, 311, 66, 246, 285, 467, 1829, 156, 571, 332, 93, 39, …
$ GRAZE   <dbl> 2, 5, 5, 3, 5, 3, 5, 5, 4, 5, 3, 5, 2, 1, 5, 5, 3, 3, 3, 2, 2,…
$ ALT     <dbl> 160, 60, 140, 160, 140, 130, 90, 60, 130, 130, 210, 160, 210, …
loyn |> str()
spc_tbl_ [56 × 7] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ ABUND  : num [1:56] 5.3 2 1.5 17.1 13.8 14.1 3.8 2.2 3.3 3 ...
 $ AREA   : num [1:56] 0.1 0.5 0.5 1 1 1 1 1 1 1 ...
 $ YR.ISOL: num [1:56] 1968 1920 1900 1966 1918 ...
 $ DIST   : num [1:56] 39 234 104 66 246 234 467 284 156 311 ...
 $ LDIST  : num [1:56] 39 234 311 66 246 ...
 $ GRAZE  : num [1:56] 2 5 5 3 5 3 5 5 4 5 ...
 $ ALT    : num [1:56] 160 60 140 160 140 130 90 60 130 130 ...
 - attr(*, "spec")=
  .. cols(
  ..   ABUND = col_double(),
  ..   AREA = col_double(),
  ..   YR.ISOL = col_double(),
  ..   DIST = col_double(),
  ..   LDIST = col_double(),
  ..   GRAZE = col_double(),
  ..   ALT = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
## Explore the first 6 rows of the data
loyn |> head()
# A tibble: 6 × 7
  ABUND  AREA YR.ISOL  DIST LDIST GRAZE   ALT
  <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl>
1   5.3   0.1    1968    39    39     2   160
2   2     0.5    1920   234   234     5    60
3   1.5   0.5    1900   104   311     5   140
4  17.1   1      1966    66    66     3   160
5  13.8   1      1918   246   246     5   140
6  14.1   1      1965   234   285     3   130
loyn |> datawizard::data_codebook()
loyn (56 rows and 7 variables, 7 shown)

ID | Name    | Type    | Missings |       Values |          N
---+---------+---------+----------+--------------+-----------
1  | ABUND   | numeric | 0 (0.0%) |  [1.5, 39.6] |         56
---+---------+---------+----------+--------------+-----------
2  | AREA    | numeric | 0 (0.0%) |  [0.1, 1771] |         56
---+---------+---------+----------+--------------+-----------
3  | YR.ISOL | numeric | 0 (0.0%) | [1890, 1976] |         56
---+---------+---------+----------+--------------+-----------
4  | DIST    | numeric | 0 (0.0%) |   [26, 1427] |         56
---+---------+---------+----------+--------------+-----------
5  | LDIST   | numeric | 0 (0.0%) |   [26, 4426] |         56
---+---------+---------+----------+--------------+-----------
6  | GRAZE   | numeric | 0 (0.0%) |            1 | 13 (23.2%)
   |         |         |          |            2 |  8 (14.3%)
   |         |         |          |            3 | 15 (26.8%)
   |         |         |          |            4 |  7 (12.5%)
   |         |         |          |            5 | 13 (23.2%)
---+---------+---------+----------+--------------+-----------
7  | ALT     | numeric | 0 (0.0%) |    [60, 260] |         56
-------------------------------------------------------------

4 Exploratory data analysis

When we explored this analysis from a frequentist perspective, we settled on a log-normal-like model: a model fit against a Gaussian distribution, but with a log link. We will replicate that model here in a Bayesian framework.

In the previous exploration of this model, we elected to treat grazing intensity as a categorical variable - we will do so again here.

loyn <- loyn |> mutate(fGRAZE = factor(GRAZE))

Model formula: \[ y_i \sim{} \mathcal{N}(\mu_i, \sigma^2)\\ \log(\mu_i) = \boldsymbol{\beta} \bf{X_i}\\ \beta_0 \sim{} \mathcal{N}(3,0.5)\\ \beta_{1-9} \sim{} \mathcal{N}(0,2.5)\\ \sigma \sim{} \operatorname{Gamma}(2,1)\\ \text{or}\\ \sigma \sim{} t_{3}(0,2.5) \]

where \(\boldsymbol{\beta}\) is a vector of effects parameters and \(\bf{X}\) is a model matrix representing the additive effects of the scaled versions of distance to the nearest patch (ln), distance to the nearest large patch (ln), patch area (ln), grazing intensity, year of isolation and altitude on the abundance of forest birds.
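To make \(\bf{X}\) concrete, the model matrix can be generated and inspected directly. This is just an illustrative sketch (it assumes the loyn data and the fGRAZE factor created above):

```r
## Build the model matrix X referenced in the formula above (assumes loyn
## and fGRAZE as defined earlier)
X <- model.matrix(~ scale(log(DIST)) + scale(log(LDIST)) + scale(log(AREA)) +
                      fGRAZE + scale(ALT) + scale(YR.ISOL),
                  data = loyn)
dim(X)       # 56 sites by 10 columns: the intercept plus nine effects
colnames(X)  # one column per parameter in beta
```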

4.1 Scatterplot matrix

To re-acquaint ourselves with the data, we will revisit the scatterplot matrix that we generated prior to the frequentist analysis.

scatterplotMatrix(~ABUND+DIST+LDIST+AREA+GRAZE+ALT+YR.ISOL, data = loyn,
                  diagonal = list(method = 'boxplot'))

We again notice that DIST, LDIST and AREA are skewed, so we will reduce this skewness via a logarithmic transformation.
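The skewness can also be quantified rather than just eyeballed. The following is a minimal base-R sketch (the skewness helper is a standard moment estimator, not part of the original analysis; the commented lines show its intended application to the loyn columns):

```r
## Moment-based sample skewness (no extra packages required)
skewness <- function(x) {
    z <- (x - mean(x)) / sd(x)
    mean(z^3)
}
## Intended use on the loyn predictors, e.g.:
## skewness(loyn$AREA); skewness(log(loyn$AREA))
## Illustration on simulated right-skewed data:
set.seed(1)
x <- rlnorm(1000)
c(raw = skewness(x), logged = skewness(log(x)))  # logging removes the skew
```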

scatterplotMatrix(~ABUND+log(DIST)+log(LDIST)+log(AREA)+GRAZE+ALT+YR.ISOL, data = loyn,
                  diagonal = list(method = 'boxplot'))

5 Fit the model

loyn.glm <- glm(ABUND~scale(log(DIST))+scale(log(LDIST))+scale(log(AREA))+
                    fGRAZE + scale(ALT) + scale(YR.ISOL),
                data = loyn,
                family = gaussian(link='log'))
loyn.glm |> summary()

Call:
glm(formula = ABUND ~ scale(log(DIST)) + scale(log(LDIST)) + 
    scale(log(AREA)) + fGRAZE + scale(ALT) + scale(YR.ISOL), 
    family = gaussian(link = "log"), data = loyn)

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)        3.092248   0.112856  27.400  < 2e-16 ***
scale(log(DIST))  -0.018067   0.057247  -0.316  0.75373    
scale(log(LDIST))  0.057086   0.059984   0.952  0.34623    
scale(log(AREA))   0.203976   0.059620   3.421  0.00132 ** 
fGRAZE2            0.039644   0.148978   0.266  0.79134    
fGRAZE3            0.019654   0.137752   0.143  0.88717    
fGRAZE4           -0.001197   0.156199  -0.008  0.99392    
fGRAZE5           -1.121563   0.343631  -3.264  0.00208 ** 
scale(ALT)        -0.003237   0.048607  -0.067  0.94719    
scale(YR.ISOL)    -0.018246   0.074404  -0.245  0.80737    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 42.40928)

    Null deviance: 6337.9  on 55  degrees of freedom
Residual deviance: 1950.8  on 46  degrees of freedom
AIC: 379.76

Number of Fisher Scoring iterations: 6

In rstanarm, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent over-fitting) and help stabilise the computations.

loyn.rstanarm <- stan_glm(ABUND ~ scale(log(DIST))+
                              scale(log(LDIST))+
                              scale(log(AREA))+
                              fGRAZE+
                              scale(ALT)+
                              scale(YR.ISOL),
                         data=loyn,
                         family=gaussian(link='log'), 
                         iter = 5000, warmup = 2500,
                         chains = 3, thin = 5, refresh = 0)
Warning: There were 154 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See
https://mc-stan.org/misc/warnings.html#bfmi-low
Warning: Examine the pairs() plot to diagnose sampling problems
Warning: The largest R-hat is 3.91, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat
Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess
Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess
Warning: Markov chains did not converge! Do not analyze results!

Conclusions:

  • the warning messages suggest that this model has not performed well
    • there are a large number of divergent transitions. Divergent transitions occur when the sampler is in a ‘sharp’ part of the posterior and the step length is too large, causing the sampler to overshoot the posterior and the sample to be rejected. This leads to inefficient and ineffective MCMC sampling. To alleviate this, we can:
      • review the model itself - it might be misspecified
      • increase adapt_delta - a parameter that governs the degree of step-length learning that occurs during the warmup stage - the more the sampler learns, the less likely it is to overshoot (although the warmup will take longer)
      • review the priors
    • the largest R-hat value is very large, suggesting that the chains have not converged
    • the effective sample sizes are too low, suggesting that there might be issues with the priors.
loyn.rstanarm <- stan_glm(ABUND ~ scale(log(DIST))+
                              scale(log(LDIST))+
                              scale(log(AREA))+
                              fGRAZE+
                              scale(ALT)+
                              scale(YR.ISOL),
                         data=loyn,
                         family=gaussian(link='log'), 
                         iter = 5000, warmup = 2500,
                         chains = 3, thin = 5, refresh = 0,
                         adapt_delta = 0.99)
Warning: There were 1039 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See
https://mc-stan.org/misc/warnings.html#bfmi-low
Warning: Examine the pairs() plot to diagnose sampling problems
Warning: The largest R-hat is 5.98, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat
Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess
Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess
Warning: Markov chains did not converge! Do not analyze results!

There are still a substantial number of divergent transitions. It is possible that the priors are too vague.

prior_summary(loyn.rstanarm)
Priors for model 'loyn.rstanarm' 
------
Intercept (after predictors centered)
  Specified prior:
    ~ normal(location = 0, scale = 2.5)
  Adjusted prior:
    ~ normal(location = 0, scale = 27)

Coefficients
  Specified prior:
    ~ normal(location = [0,0,0,...], scale = [2.5,2.5,2.5,...])
  Adjusted prior:
    ~ normal(location = [0,0,0,...], scale = [26.84,26.84,26.84,...])

Auxiliary (sigma)
  Specified prior:
    ~ exponential(rate = 1)
  Adjusted prior:
    ~ exponential(rate = 0.093)
------
See help('prior_summary.stanreg') for more details

Conclusions:

  • the default priors appear to be overly wide. We will instead define our own priors.

5.1 Assessing priors

loyn.rstanarm1 <- update(loyn.rstanarm,  prior_PD=TRUE)
ggemmeans(loyn.rstanarm1,  ~AREA) |> plot(add.data=TRUE) + scale_y_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

ggpredict(loyn.rstanarm1) |>
    plot(add.data = TRUE) |>
    wrap_plots() &
    scale_y_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

Conclusions:

  • we see that the range of predictions is extremely wide and the slope could range from strongly negative to strongly positive.
  • \(\beta_0\): normal centred at 3 with a standard deviation of 1 (a mean of 3 on the log scale corresponds to approximately 20.1 on the response scale)
    • mean of 3: since mean(log(loyn$ABUND)) is approximately 3
    • sd of 1: since sd(log(loyn$ABUND)) is approximately 1
  • \(\beta_{1-9}\): normal centred at 0 with a standard deviation of 1 (again, consider what this would be on the response scale)
    • sd of 1: since sd(log(loyn$ABUND))/sd(scale(log(loyn$AREA))) is approximately 1 (and since each predictor is scaled, it will be similar for each predictor)
  • \(\sigma\): half-Cauchy prior with a scale of 2.
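These link-scale justifications can be checked directly; a quick sketch, assuming loyn is loaded as above:

```r
## Empirical centre and spread of the response on the (log) link scale
mean(log(loyn$ABUND))  # close to 3, motivating the normal(3, 1) intercept prior
sd(log(loyn$ABUND))    # close to 1, motivating its standard deviation
exp(3)                 # ~20.1 birds: the intercept prior mean back on the response scale
```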

I will also overlay the raw data for comparison.

loyn.rstanarm2 <- stan_glm(ABUND ~ scale(log(DIST))+
                             scale(log(LDIST))+
                              scale(log(AREA))+
                              fGRAZE+
                              scale(ALT)+
                              scale(YR.ISOL), data=loyn,
                          family=gaussian(link='log'),
                          prior_intercept = normal(3, 1,  autoscale=FALSE),
                          prior = normal(0, 1, autoscale=FALSE),
                          prior_aux = cauchy(0, 2),
                          prior_PD=TRUE, 
                          iter = 5000, thin=5,
                          chains = 3, warmup=2500, 
                          refresh=0) 

Conclusions:

  • there are no longer any warning messages.
ggemmeans(loyn.rstanarm2,  ~AREA) |>
  plot(add.data=TRUE) + scale_y_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

ggpredict(loyn.rstanarm2) |>
    plot(add.data = TRUE) |>
    wrap_plots() &
    scale_y_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

Now lets refit, conditioning on the data.

loyn.rstanarm3 <- update(loyn.rstanarm2, prior_PD = FALSE)
posterior_vs_prior(loyn.rstanarm3, color_by='vs', group_by=TRUE,
                   facet_args=list(scales='free_y'))

Drawing from prior...

Conclusions:

  • in each case, the prior is substantially wider than the posterior, suggesting that the posterior is not biased towards the prior.
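The bayestestR package (loaded above) can complement this visual comparison with a numerical one; a sketch, assuming loyn.rstanarm3 as fitted above:

```r
## Classify each prior as informative/uninformative by comparing posterior
## and prior scales (Gelman heuristic; the default method)
loyn.rstanarm3 |> bayestestR::check_prior()
```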
#ggemmeans(loyn.rstanarm3,  ~AREA) |> plot(add.data=TRUE) + scale_y_log10()
ggpredict(loyn.rstanarm3,  terms = "AREA[0:1000]") |>
    plot(jitter = FALSE, add.data=TRUE, log.y = TRUE) + scale_x_log10()
Warning: Transformation introduced infinite values in continuous x-axis
Transformation introduced infinite values in continuous x-axis
Warning: Removed 1 row containing missing values (`geom_line()`).

## ggpredict(loyn.rstanarm3,  terms = "AREA[0:1000]") |>
##     plot(jitter = FALSE, residuals=TRUE, log.y = TRUE) + scale_x_log10()
ggpredict(loyn.rstanarm3) |>
    plot(add.data = TRUE) |>
    wrap_plots() &
    scale_y_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

Note that the second set of figures above comprises partial plots (the effect of each predictor at the average of the continuous predictors and the first level of the categorical predictor); the raw data are not similarly standardised and thus may appear not to match the trends…

In brms, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent over-fitting) and help stabilise the computations.

Unlike rstanarm, brms models must be compiled before they start sampling. For most models, compiling the Stan code takes around 45 seconds.

loyn.form <- bf(ABUND ~ scale(log(DIST))+
                       scale(log(LDIST))+
                       scale(log(AREA))+
                       fGRAZE+
                       scale(ALT)+
                       scale(YR.ISOL),
                   family = gaussian(link = 'log'))
loyn.brm <- brm(loyn.form,
                data = loyn,
                iter = 5000,
                warmup = 2500,
                chains = 3, cores = 3,
                thin = 5,
                refresh = 0,
                backend = "cmdstan")
Start sampling
Running MCMC with 3 parallel chains...

Chain 1 finished in 0.9 seconds.
Chain 3 finished in 1.8 seconds.
Chain 2 finished in 2.0 seconds.

All 3 chains finished successfully.
Mean chain execution time: 1.6 seconds.
Total execution time: 2.2 seconds.
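As an aside, the compilation overhead mentioned earlier can be avoided on re-runs by caching the fitted model with brm()'s file argument. A sketch (the filename here is arbitrary):

```r
## Save the compiled/fitted model to loyn_brm.rds; subsequent calls with the
## same file argument reload it instead of recompiling and resampling
loyn.brm <- brm(loyn.form,
                data = loyn,
                iter = 5000, warmup = 2500,
                chains = 3, cores = 3, thin = 5,
                refresh = 0,
                backend = "cmdstan",
                file = "loyn_brm")
```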
loyn.brm |> prior_summary()
                 prior     class          coef group resp dpar nlpar lb ub       source
                (flat)         b                                                default
                (flat)         b       fGRAZE2                             (vectorized)
                (flat)         b       fGRAZE3                             (vectorized)
                (flat)         b       fGRAZE4                             (vectorized)
                (flat)         b       fGRAZE5                             (vectorized)
                (flat)         b      scaleALT                             (vectorized)
                (flat)         b  scalelogAREA                             (vectorized)
                (flat)         b  scalelogDIST                             (vectorized)
                (flat)         b scalelogLDIST                             (vectorized)
                (flat)         b  scaleYR.ISOL                             (vectorized)
  student_t(3, 3, 2.5) Intercept                                                default
 student_t(3, 0, 11.8)     sigma                                      0         default
priors <- prior(normal(0, 2.5), class = 'b')
loyn.form <- bf(ABUND ~ scale(log(DIST))+
                    scale(log(LDIST))+
                    scale(log(AREA))+
                    fGRAZE+
                    scale(ALT)+
                    scale(YR.ISOL),
                family = gaussian(link = 'log'))
loyn.brm1 <- brm(loyn.form,
                 data = loyn, 
                 prior = priors,
                 sample_prior = 'only', 
                 iter = 5000,
                 warmup = 2500,
                 chains = 3,
                 thin = 5, 
                 refresh = 0,
                 backend = "cmdstan")
Start sampling
Running MCMC with 3 sequential chains...

Chain 1 finished in 0.1 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.

All 3 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.5 seconds.
## Individual plots
loyn.brm1 |> 
    ggemmeans(~AREA) |>
    plot(add.data = TRUE) + scale_y_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

## All effects 
loyn.brm1 |>
    conditional_effects() |>
    plot(points = TRUE, ask = FALSE, plot = FALSE) |> 
    wrap_plots() &
    scale_y_log10()

loyn.brm1 |>
    ggpredict() |>
    plot(add.data=TRUE, facet=TRUE) +
    scale_y_log10() 
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Warning: Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis

Conclusions:

  • we see that the range of predictions is fairly wide and the slope could range from strongly negative to strongly positive.
  • \(\beta_0\): normal centred at 3 with a standard deviation of 0.5 (a mean of 3 on the log scale corresponds to approximately 20.1 on the response scale)
    • mean of 3: since median(log(loyn$ABUND)) is approximately 3
    • sd of 0.5: since mad(log(loyn$ABUND)) is approximately 0.5
  • \(\beta_{1-9}\): normal centred at 0 with a standard deviation of 1.5 (again, consider what this would be on the response scale)
    • sd of 1.5: wide enough to accommodate mad(log(loyn$ABUND))/mad(scale(log(loyn$AREA))) (and, since each predictor is scaled, the comparable ratios for the other predictors - computed below)
  • \(\sigma\): half-t prior (Student t with 3 degrees of freedom, truncated at zero) with a scale of 2.5.
mod.mat <- model.matrix(as.formula(loyn.form), data = loyn)
mad(log(loyn$ABUND))/
    apply(mod.mat, 2, mad)
      (Intercept)  scale(log(DIST)) scale(log(LDIST))  scale(log(AREA)) 
              Inf         0.6280229         0.5626313         0.5009688 
          fGRAZE2           fGRAZE3           fGRAZE4           fGRAZE5 
              Inf               Inf               Inf               Inf 
       scale(ALT)    scale(YR.ISOL) 
        0.5127373         1.3907494 
priors <- prior(normal(3, 0.5),  class = 'Intercept') +
    prior(normal(0, 1.5), class = 'b') +
    ## prior(gamma(1, 1), class = 'sigma')
    prior(student_t(3, 0, 2.5), class = 'sigma')
loyn.form <- bf(ABUND ~ scale(log(DIST))+
                     scale(log(LDIST))+
                     scale(log(AREA))+
                     fGRAZE+
                     scale(ALT)+
                     scale(YR.ISOL),
                   family = gaussian(link = 'log'))
                   ## family = lognormal())
loyn.brm2 <- brm(loyn.form,
                data = loyn, 
                prior = priors,
                sample_prior = 'only', 
                iter = 5000,
                warmup = 2500,
                chains = 3, cores = 3,
                thin = 5,
                refresh = 0,
                backend = "cmdstan")
Start sampling
Running MCMC with 3 parallel chains...

Chain 1 finished in 0.0 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.0 seconds.

All 3 chains finished successfully.
Mean chain execution time: 0.0 seconds.
Total execution time: 0.2 seconds.
loyn.brm2 |> ggemmeans(~DIST) |>
    plot(add.data = TRUE) + 
    scale_y_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

loyn.brm2 |>
    conditional_effects() |>
    plot(points = TRUE, ask = FALSE, plot = FALSE) |>
    ## lapply(function(x) x + scale_y_log10()) |>
    wrap_plots() &
    scale_y_log10()

loyn.brm2 |>
    ggpredict() |>
    plot(add.data=TRUE, facet=TRUE) +
    scale_y_log10() 
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Warning: Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis

loyn.brm3 <- update(loyn.brm2,  sample_prior = 'yes', refresh = 0)
The desired updates require recompiling the model
Start sampling
Running MCMC with 3 sequential chains...

Chain 1 finished in 0.3 seconds.
Chain 2 finished in 0.3 seconds.
Chain 3 finished in 0.3 seconds.

All 3 chains finished successfully.
Mean chain execution time: 0.3 seconds.
Total execution time: 1.3 seconds.
loyn.brm3 |> get_variables()
 [1] "b_Intercept"     "b_scalelogDIST"  "b_scalelogLDIST" "b_scalelogAREA" 
 [5] "b_fGRAZE2"       "b_fGRAZE3"       "b_fGRAZE4"       "b_fGRAZE5"      
 [9] "b_scaleALT"      "b_scaleYR.ISOL"  "sigma"           "prior_Intercept"
[13] "prior_b"         "prior_sigma"     "lprior"          "lp__"           
[17] "accept_stat__"   "treedepth__"     "stepsize__"      "divergent__"    
[21] "n_leapfrog__"    "energy__"       
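These variable names can be passed to tidybayes (loaded above) to obtain tidy posterior draws; a minimal sketch, assuming loyn.brm3 as fitted above:

```r
## Tidy draws for the AREA effect, summarised as a median and 95% highest
## density continuous interval
loyn.brm3 |>
    spread_draws(b_scalelogAREA) |>
    median_hdci(b_scalelogAREA)
```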
## loyn.brm3 |> hypothesis('Intercept = 0', class = 'b') |> plot
## loyn.brm3 |> hypothesis('Intercept = 0', class = 'prior') |> plot
loyn.brm3 |> hypothesis('scalelogDIST = 0') |> plot()

loyn.brm3 |> hypothesis('scalelogAREA = 0') |> plot()

loyn.brm3 |> hypothesis('sigma = 0', class = '') |> plot()

loyn.brm3 |> SUYR_prior_and_posterior()

loyn.brm3 |> standata()
$N
[1] 56

$Y
 [1]  5.3  2.0  1.5 17.1 13.8 14.1  3.8  2.2  3.3  3.0 27.6  1.8 21.2 14.6  8.0
[16]  3.5 29.0  2.9 24.3 19.4 24.4  5.0 15.8 25.3 19.5 20.9 16.3 18.8 19.9 13.0
[31]  6.8 21.7 27.8 26.8 16.6 30.4 11.5 26.0 25.7 12.7 23.5 24.9 29.0 28.3 28.3
[46] 32.0 37.7 39.6 29.6 31.0 34.4 27.3 30.5 33.0 29.5 30.9

$K
[1] 10

$X
   Intercept scalelogDIST scalelogLDIST scalelogAREA fGRAZE2 fGRAZE3 fGRAZE4
1          1  -1.51019511   -1.65892116  -2.37802367       1       0       0
2          1   0.37029448   -0.30532977  -1.51765965       0       0       0
3          1  -0.48079400   -0.09042449  -1.51765965       0       0       0
4          1  -0.95804924   -1.26148217  -1.14712103       0       1       0
5          1   0.42278148   -0.26754922  -1.14712103       0       0       0
6          1   0.37029448   -0.15637842  -1.14712103       0       1       0
7          1   1.09552219    0.21669491  -1.14712103       0       0       0
8          1   0.57353754    1.24803687  -1.14712103       0       0       0
9          1  -0.05524976   -0.61163991  -1.14712103       0       0       1
10         1   0.66885367    0.36858640  -1.14712103       0       0       0
11         1  -0.95804924   -0.04106159  -0.77658241       0       1       0
12         1  -0.59812145   -1.00240327  -0.77658241       0       0       0
13         1  -1.51019511   -1.65892116  -0.77658241       1       0       0
14         1   0.93822292    0.10346964  -0.77658241       0       0       0
15         1   0.47682817   -0.22864597  -0.77658241       0       0       0
16         1  -0.24660010    0.43442972  -0.77658241       0       0       0
17         1  -1.93573935   -1.96523129  -0.55983122       0       1       0
18         1  -1.93573935   -1.96523129  -0.55983122       0       1       0
19         1  -1.48362354   -1.63979473  -0.40604380       0       1       0
20         1   0.47682817   -0.22864597  -0.40604380       1       0       0
21         1   0.37029448    0.29645165  -0.40604380       1       0       0
22         1  -1.93573935    1.38927509  -0.40604380       0       0       0
23         1  -1.51019511   -1.65892116  -0.28675700       0       1       0
24         1   0.85682391    0.04487798  -0.28675700       0       0       0
25         1  -0.59812145   -0.33160908  -0.18929260       0       1       0
26         1  -0.03525827    0.79868571  -0.18929260       0       1       0
27         1   0.57722655    0.69705984  -0.10688762       0       1       0
28         1  -0.22265561   -0.73213997  -0.10688762       0       0       1
29         1   0.50481706   -0.20849935  -0.03550518       0       0       1
30         1   0.79284436    1.26397616   0.02745860       0       0       0
31         1   0.75311975    0.29645165   0.08378161       0       1       0
32         1  -1.89613008   -1.93672022   0.13473198       0       0       1
33         1  -0.03525827   -0.59724988   0.18124602       0       0       1
34         1  -0.22265561   -0.73213997   0.18124602       0       0       1
35         1  -0.12477989   -0.66168825   0.22403479       1       0       0
36         1  -0.12477989    0.09591504   0.30053281       0       1       0
37         1   0.90372228    1.51230754   0.36744180       0       0       0
38         1  -1.48362354    1.66778539   0.39799721       1       0       0
39         1   0.50481706    0.99128245   0.42690016       0       0       1
40         1   0.66885367    1.53439756   0.50527059       0       0       0
41         1   1.35327186    0.40222517   0.59457340       0       0       0
42         1   1.25762760    0.59446805   0.65294853       0       1       0
43         1   0.24667868   -0.39430941   0.70557205       0       0       0
44         1  -0.95804924   -0.01204503   0.73798041       0       0       0
45         1   0.57722655   -0.51197475   0.82485885       1       0       0
46         1  -0.59812145   -1.00240327   0.87580921       0       1       0
47         1   0.47682817    0.98837574   0.92232325       0       1       0
48         1   2.26783778    1.12640241   0.93334579       0       0       0
49         1   0.92772762    1.07832552   0.94414564       0       0       0
50         1   1.09552219    0.21669491   1.01418997       0       0       0
51         1  -1.51019511    0.29645165   1.29286187       1       0       0
52         1   0.93822292    1.91565164   1.35582564       0       0       0
53         1   1.09552219    1.39201101   1.47113789       0       0       0
54         1  -0.12477989   -0.07123733   1.50961306       0       0       0
55         1   0.75311975    1.00336997   2.53095496       0       0       0
56         1   0.73743154   -0.04106159   2.85111978       0       0       0
   fGRAZE5    scaleALT scaleYR.ISOL
1        0  0.31588146    0.7134084
2        1 -1.98143824   -1.1629534
3        1 -0.14358248   -1.9447708
4        0  0.31588146    0.6352266
5        1 -0.14358248   -1.2411351
6        0 -0.37331445    0.5961358
7        1 -1.29224233    0.2052271
8        1 -1.98143824   -1.1629534
9        0 -0.37331445    0.5961358
10       1 -0.37331445   -1.9447708
11       0  1.46454131   -0.9284082
12       1  0.31588146   -2.3356795
13       0  1.46454131    0.9088627
14       0  1.46454131    0.8697719
15       1 -0.60304642   -1.9447708
16       1 -0.02871650   -1.9447708
17       0 -0.83277839    0.4788632
18       0 -0.14358248    0.5961358
19       0  1.00507737    0.4006814
20       0 -1.29224233    0.1270453
21       0  1.69427328    0.9088627
22       1 -0.60304642   -1.0456808
23       0 -0.37331445    0.5961358
24       0 -1.06251036    0.6743175
25       0  0.54561343   -2.3356795
26       0  0.08614949    0.4006814
27       0 -0.37331445    0.5961358
28       0  1.46454131    0.4006814
29       0 -0.60304642    0.9088627
30       1 -1.29224233   -1.5538621
31       0 -0.83277839    0.4788632
32       0  0.66047941    0.4006814
33       0 -0.83277839    0.5179540
34       0  1.23480934    0.4006814
35       0  1.00507737    0.7134084
36       0 -0.60304642    0.6352266
37       1 -1.06251036   -1.1629534
38       0  1.00507737    0.6352266
39       0  0.08614949    0.9088627
40       1 -1.29224233   -1.2411351
41       0 -0.14358248    0.5179540
42       0 -0.37331445    0.5961358
43       0  1.00507737    0.9479536
44       0 -0.83277839    0.5961358
45       0 -0.60304642    0.4788632
46       0  1.00507737    0.4006814
47       0 -0.60304642   -0.8502264
48       0  0.77534540    0.8697719
49       0 -0.14358248    0.6743175
50       0  0.43074744    0.5179540
51       0  0.66047941    1.0261353
52       0 -1.75170627    0.5570449
53       0  0.31588146    0.5570449
54       0  1.00507737   -0.3811360
55       0  1.00507737    0.7915901
56       0  2.61320116   -0.6547721
attr(,"assign")
 [1] 0 1 2 3 4 4 4 4 5 6
attr(,"contrasts")
attr(,"contrasts")$fGRAZE
  2 3 4 5
1 0 0 0 0
2 1 0 0 0
3 0 1 0 0
4 0 0 1 0
5 0 0 0 1


$prior_only
[1] 0

attr(,"class")
[1] "standata" "list"    
loyn.brm3 |> stancode()
// generated with brms 2.19.0
functions {
  
}
data {
  int<lower=1> N; // total number of observations
  vector[N] Y; // response variable
  int<lower=1> K; // number of population-level effects
  matrix[N, K] X; // population-level design matrix
  int prior_only; // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc; // centered version of X without an intercept
  vector[Kc] means_X; // column means of X before centering
  for (i in 2 : K) {
    means_X[i - 1] = mean(X[ : , i]);
    Xc[ : , i - 1] = X[ : , i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b; // population-level effects
  real Intercept; // temporary intercept for centered predictors
  real<lower=0> sigma; // dispersion parameter
}
transformed parameters {
  real lprior = 0; // prior contributions to the log posterior
  lprior += normal_lpdf(b | 0, 1.5);
  lprior += normal_lpdf(Intercept | 3, 0.5);
  lprior += student_t_lpdf(sigma | 3, 0, 2.5)
            - 1 * student_t_lccdf(0 | 3, 0, 2.5);
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = rep_vector(0.0, N);
    mu += Intercept + Xc * b;
    mu = exp(mu);
    target += normal_lpdf(Y | mu, sigma);
  }
  // priors including constants
  target += lprior;
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  // additionally sample draws from priors
  real prior_b = normal_rng(0, 1.5);
  real prior_Intercept = normal_rng(3, 0.5);
  real prior_sigma = student_t_rng(3, 0, 2.5);
  // use rejection sampling for truncated priors
  while (prior_sigma < 0) {
    prior_sigma = student_t_rng(3, 0, 2.5);
  }
}

6 MCMC sampling diagnostics

The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.

See list of available diagnostics by name
available_mcmc()
bayesplot MCMC module:
  mcmc_acf
  mcmc_acf_bar
  mcmc_areas
  mcmc_areas_ridges
  mcmc_combo
  mcmc_dens
  mcmc_dens_chains
  mcmc_dens_overlay
  mcmc_hex
  mcmc_hist
  mcmc_hist_by_chain
  mcmc_intervals
  mcmc_neff
  mcmc_neff_hist
  mcmc_nuts_acceptance
  mcmc_nuts_divergence
  mcmc_nuts_energy
  mcmc_nuts_stepsize
  mcmc_nuts_treedepth
  mcmc_pairs
  mcmc_parcoord
  mcmc_rank_ecdf
  mcmc_rank_hist
  mcmc_rank_overlay
  mcmc_recover_hist
  mcmc_recover_intervals
  mcmc_recover_scatter
  mcmc_rhat
  mcmc_rhat_hist
  mcmc_scatter
  mcmc_trace
  mcmc_trace_highlight
  mcmc_violin

Of these, we will focus on:

  • mcmc_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different shade of blue, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
plot(loyn.rstanarm3, plotfun='mcmc_trace')

The chains appear well mixed and very similar

  • acf (auto-correlation function): plots the auto-correlation between successive MCMC sample lags for each parameter and each chain
plot(loyn.rstanarm3, 'acf_bar')
Warning: The `facets` argument of `facet_grid()` is deprecated as of ggplot2 2.2.0.
ℹ Please use the `rows` argument instead.
ℹ The deprecated feature was likely used in the bayesplot package.
  Please report the issue at <https://github.com/stan-dev/bayesplot/issues/>.

There is no evidence of auto-correlation in the MCMC samples

  • Rhat: Rhat is a measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
plot(loyn.rstanarm3, 'rhat_hist')
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All Rhat values are below 1.05, suggesting the chains have converged.
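
The quantity summarised in the histogram above is straightforward to compute by hand. Below is a minimal base R sketch of the classic (non-split, non-rank-normalised) Gelman-Rubin calculation, applied to simulated chains rather than the actual loyn.rstanarm3 draws:

```r
set.seed(1)
# draws: one column per chain (simulated stand-ins for real posterior draws)
n_iter <- 1000
draws <- matrix(rnorm(n_iter * 4), nrow = n_iter, ncol = 4)

rhat <- function(draws) {
  n <- nrow(draws)
  W <- mean(apply(draws, 2, var))        # mean within-chain variance
  B <- n * var(colMeans(draws))          # between-chain variance of the chain means
  var_plus <- (n - 1) / n * W + B / n    # pooled estimate of the posterior variance
  sqrt(var_plus / W)
}

rhat(draws)  # well-mixed chains give values very close to 1
```

Note that rstanarm and brms report an improved version of this statistic (split chains with rank normalisation), so the sketch above is illustrative only.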

  • neff (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation among the draws is accounted for) to the total number of samples provides an indication of the efficiency of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are strongly autocorrelated and carry substantially less information than their raw number implies.

    If the ratios are low, tightening the priors may help.

plot(loyn.rstanarm3, 'neff_hist')
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Ratios all very high.
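
The neff ratio can be approximated directly from the autocorrelation of the draws. A base R sketch using a deliberately autocorrelated AR(1) chain in place of real draws (the truncation rule here is a crude stand-in for Stan's initial positive sequence rule):

```r
set.seed(1)
n <- 2000
x <- as.numeric(arima.sim(list(ar = 0.7), n = n))  # strongly autocorrelated chain

# n_eff ~ n / (1 + 2 * sum of autocorrelations), summed up to the
# first non-positive lag
rho <- acf(x, lag.max = 100, plot = FALSE)$acf[-1]
if (any(rho <= 0)) rho <- rho[seq_len(which(rho <= 0)[1] - 1)]
n_eff <- n / (1 + 2 * sum(rho))

n_eff / n  # the ratio plotted by 'neff_hist'; well below 1 for this chain
```

For an AR(1) chain with coefficient 0.7, the theoretical ratio is (1 - 0.7)/(1 + 0.7), roughly 0.18 - well under the 0.5 rule of thumb.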

More diagnostics
plot(loyn.rstanarm3, 'combo')

plot(loyn.rstanarm3, 'violin')

The rstan package also offers a range of MCMC diagnostics.

Of these, we will focus on:

  • stan_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
stan_trace(loyn.rstanarm3)

The chains appear well mixed and very similar

  • stan_acf (auto-correlation function): plots the auto-correlation between successive MCMC sample lags for each parameter and each chain
stan_ac(loyn.rstanarm3) 

There is no evidence of auto-correlation in the MCMC samples

  • stan_rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
stan_rhat(loyn.rstanarm3) 
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All Rhat values are below 1.05, suggesting the chains have converged.

  • stan_ess (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation among the draws is accounted for) to the total number of samples provides an indication of the efficiency of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are strongly autocorrelated and carry substantially less information than their raw number implies.

    If the ratios are low, tightening the priors may help.

stan_ess(loyn.rstanarm3)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Ratios all very high.

stan_dens(loyn.rstanarm3, separate_chains = TRUE)

The ggmcmc package also has a set of MCMC diagnostic functions.

Of these, we will focus on:

  • ggs_traceplot: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
loyn.ggs <- ggs(loyn.rstanarm3)
ggs_traceplot(loyn.ggs)

The chains appear well mixed and very similar

  • ggs_autocorrelation (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
ggs_autocorrelation(loyn.ggs)

There is no evidence of autocorrelation in the MCMC samples

  • ggs_Rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
ggs_Rhat(loyn.ggs)

All Rhat values are below 1.05, suggesting the chains have converged.

  • ggs_effective (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation among the draws is accounted for) to the total number of samples provides an indication of the efficiency of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are strongly autocorrelated and carry substantially less information than their raw number implies.

    If the ratios are low, tightening the priors may help.

ggs_effective(loyn.ggs)
Warning: Returning more (or less) than 1 row per `summarise()` group was deprecated in
dplyr 1.1.0.
ℹ Please use `reframe()` instead.
ℹ When switching from `summarise()` to `reframe()`, remember that `reframe()`
  always returns an ungrouped data frame and adjust accordingly.
ℹ The deprecated feature was likely used in the ggmcmc package.
  Please report the issue at <https://github.com/xfim/ggmcmc/issues/>.

Ratios all very high.

More diagnostics
ggs_crosscorrelation(loyn.ggs)

ggs_grb(loyn.ggs)

The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.

See list of available diagnostics by name
available_mcmc()
bayesplot MCMC module:
  mcmc_acf
  mcmc_acf_bar
  mcmc_areas
  mcmc_areas_ridges
  mcmc_combo
  mcmc_dens
  mcmc_dens_chains
  mcmc_dens_overlay
  mcmc_hex
  mcmc_hist
  mcmc_hist_by_chain
  mcmc_intervals
  mcmc_neff
  mcmc_neff_hist
  mcmc_nuts_acceptance
  mcmc_nuts_divergence
  mcmc_nuts_energy
  mcmc_nuts_stepsize
  mcmc_nuts_treedepth
  mcmc_pairs
  mcmc_parcoord
  mcmc_rank_ecdf
  mcmc_rank_hist
  mcmc_rank_overlay
  mcmc_recover_hist
  mcmc_recover_intervals
  mcmc_recover_scatter
  mcmc_rhat
  mcmc_rhat_hist
  mcmc_scatter
  mcmc_trace
  mcmc_trace_highlight
  mcmc_violin

Of these, we will focus on:

  • mcmc_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different shade of blue, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
loyn.brm3 |> mcmc_plot(type = 'trace')
No divergences to plot.

The chains appear well mixed and very similar

  • acf (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
loyn.brm3 |> mcmc_plot(type = 'acf_bar')
Warning: The `facets` argument of `facet_grid()` is deprecated as of ggplot2 2.2.0.
ℹ Please use the `rows` argument instead.
ℹ The deprecated feature was likely used in the bayesplot package.
  Please report the issue at <https://github.com/stan-dev/bayesplot/issues/>.

There is no evidence of autocorrelation in the MCMC samples

  • Rhat: Rhat is a measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
loyn.brm3 |> mcmc_plot(type = 'rhat_hist')
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All Rhat values are below 1.05, suggesting the chains have converged.

  • neff (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation among the draws is accounted for) to the total number of samples provides an indication of the efficiency of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are strongly autocorrelated and carry substantially less information than their raw number implies.

    If the ratios are low, tightening the priors may help.

loyn.brm3 |> mcmc_plot(type = 'neff_hist')
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Ratios all very high.

More diagnostics
loyn.brm3 |> mcmc_plot(type = 'combo')

loyn.brm3 |> mcmc_plot(type = 'violin')

The rstan package also offers a range of MCMC diagnostics.

Of these, we will focus on:

  • stan_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
loyn.brm3$fit |> stan_trace()
'pars' not specified. Showing first 10 parameters by default.

The chains appear well mixed and very similar

  • stan_acf (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
loyn.brm3$fit |> stan_ac() 
'pars' not specified. Showing first 10 parameters by default.

There is no evidence of autocorrelation in the MCMC samples

  • stan_rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
loyn.brm3$fit |> stan_rhat() 
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All Rhat values are below 1.05, suggesting the chains have converged.

  • stan_ess (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation among the draws is accounted for) to the total number of samples provides an indication of the efficiency of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are strongly autocorrelated and carry substantially less information than their raw number implies.

    If the ratios are low, tightening the priors may help.

loyn.brm3$fit |> stan_ess()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Ratios all very high.

loyn.brm3$fit |> stan_dens(separate_chains = TRUE)
'pars' not specified. Showing first 10 parameters by default.

The ggmcmc package also has a set of MCMC diagnostic functions.

Of these, we will focus on:

  • ggs_traceplot: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
loyn.ggs <- loyn.brm3 |> ggs(inc_warmup = FALSE, burnin = FALSE)
loyn.ggs |> ggs_traceplot()

The chains appear well mixed and very similar

  • ggs_autocorrelation (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
loyn.ggs |> ggs_autocorrelation()

There is no evidence of autocorrelation in the MCMC samples

  • ggs_Rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
loyn.ggs |> ggs_Rhat()

All Rhat values are below 1.05, suggesting the chains have converged.

  • ggs_effective (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation among the draws is accounted for) to the total number of samples provides an indication of the efficiency of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are strongly autocorrelated and carry substantially less information than their raw number implies.

    If the ratios are low, tightening the priors may help.

loyn.ggs |> ggs_effective()
Warning: Returning more (or less) than 1 row per `summarise()` group was deprecated in
dplyr 1.1.0.
ℹ Please use `reframe()` instead.
ℹ When switching from `summarise()` to `reframe()`, remember that `reframe()`
  always returns an ungrouped data frame and adjust accordingly.
ℹ The deprecated feature was likely used in the ggmcmc package.
  Please report the issue at <https://github.com/xfim/ggmcmc/issues/>.

Ratios all very high.

More diagnostics
loyn.ggs |> ggs_crosscorrelation()

loyn.ggs |> ggs_grb()

7 Model validation

Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they compare predictions drawn from the model with the observed data used to train it.

See list of available diagnostics by name
available_ppc()
bayesplot PPC module:
  ppc_bars
  ppc_bars_grouped
  ppc_boxplot
  ppc_dens
  ppc_dens_overlay
  ppc_dens_overlay_grouped
  ppc_ecdf_overlay
  ppc_ecdf_overlay_grouped
  ppc_error_binned
  ppc_error_hist
  ppc_error_hist_grouped
  ppc_error_scatter
  ppc_error_scatter_avg
  ppc_error_scatter_avg_grouped
  ppc_error_scatter_avg_vs_x
  ppc_freqpoly
  ppc_freqpoly_grouped
  ppc_hist
  ppc_intervals
  ppc_intervals_grouped
  ppc_km_overlay
  ppc_km_overlay_grouped
  ppc_loo_intervals
  ppc_loo_pit
  ppc_loo_pit_overlay
  ppc_loo_pit_qq
  ppc_loo_ribbon
  ppc_pit_ecdf
  ppc_pit_ecdf_grouped
  ppc_ribbon
  ppc_ribbon_grouped
  ppc_rootogram
  ppc_scatter
  ppc_scatter_avg
  ppc_scatter_avg_grouped
  ppc_stat
  ppc_stat_2d
  ppc_stat_freqpoly
  ppc_stat_freqpoly_grouped
  ppc_stat_grouped
  ppc_violin_grouped
  • dens_overlay: plots the density distribution of the observed data (black line) overlaid on 50 density distributions generated from draws from the model (light blue). Ideally, the 50 realisations should be roughly consistent with the observed data.
pp_check(loyn.rstanarm3,  plotfun='dens_overlay')

The model draws appear to deviate from the observed data.

  • error_scatter_avg: this plots the average predictive errors against the observed values. Similar to a residual plot, we do not want to see any patterns in this plot. There is some pattern remaining in these errors.
pp_check(loyn.rstanarm3, plotfun='error_scatter_avg')

The predictive error seems to be related to the predictor - the model performs poorest at larger patch areas.

  • error_scatter_avg_vs_x: this is similar to a regular residual plot and should be interpreted as such.
pp_check(loyn.rstanarm3, x=loyn$AREA, plotfun='error_scatter_avg_vs_x')

  • intervals: plots the observed data overlaid on posterior predictions associated with each level of the predictor. Ideally, the observed data should all fall within the predictive intervals.
pp_check(loyn.rstanarm3, x=loyn$AREA, plotfun='intervals')

The modelled predictions seem to underestimate the uncertainty with increasing patch area.

  • ribbon: this is just an alternative way of expressing the above plot.
pp_check(loyn.rstanarm3, x=loyn$AREA, plotfun='ribbon')
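
The intervals and ribbon checks can be condensed into a single calibration number: the proportion of observations that fall inside their posterior predictive interval. A sketch using fabricated draws in place of the real posterior_predict() output:

```r
set.seed(1)
# stand-in for posterior_predict(): draws (rows) by observations (columns)
preds <- matrix(rnorm(1000 * 56, mean = 20, sd = 6), nrow = 1000)
obs <- rnorm(56, mean = 20, sd = 6)  # stand-in for loyn$ABUND

lwr <- apply(preds, 2, quantile, probs = 0.025)
upr <- apply(preds, 2, quantile, probs = 0.975)
mean(obs >= lwr & obs <= upr)  # close to 0.95 for a well-calibrated model
```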

The shinystan package allows the full suite of MCMC diagnostics and posterior predictive checks to be accessed via a web interface.

#library(shinystan)
#launch_shinystan(loyn.rstanarm3)

DHARMa residuals provide very useful diagnostics. Unfortunately, we cannot directly use the simulateResiduals() function to generate the simulated residuals. However, if we are willing to calculate some of the components ourselves, we can still obtain simulated residuals from the fitted Stan model.

We need to supply:

  • simulated (predicted) responses associated with each observation.
  • observed values
  • fitted (predicted) responses (averaged) associated with each observation
preds <- posterior_predict(loyn.rstanarm3,  ndraws=250,  summary=FALSE)
loyn.resids <- createDHARMa(simulatedResponse = t(preds),
                            observedResponse = loyn$ABUND,
                            fittedPredictedResponse = apply(preds, 2, median),
                            integerResponse = FALSE)
plot(loyn.resids)

Conclusions:

  • the simulated residuals suggest a general lack of fit due to over-dispersion and outliers
  • perhaps we should explore a negative binomial model
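
The scaled residuals that createDHARMa() produces are conceptually simple: each residual is the position of the observed value within the distribution of responses simulated for that observation. A base R sketch with fabricated simulations (not the loyn fit):

```r
set.seed(1)
n_obs <- 56; n_sim <- 250
mu <- runif(n_obs, 5, 40)  # hypothetical fitted means, one per observation
sims <- matrix(rnorm(n_obs * n_sim, mean = mu, sd = 6), nrow = n_obs)
obs <- rnorm(n_obs, mean = mu, sd = 6)  # hypothetical observed responses

# scaled residual = empirical CDF of the simulations, evaluated at the observation
scaled <- vapply(seq_len(n_obs),
                 function(i) ecdf(sims[i, ])(obs[i]),
                 numeric(1))

# if the model is adequate, these are approximately Uniform(0, 1) -
# this is what the DHARMa QQ plot assesses
ks.test(scaled, "punif")
```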

Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they compare predictions drawn from the model with the observed data used to train it.

See list of available diagnostics by name
available_ppc()
bayesplot PPC module:
  ppc_bars
  ppc_bars_grouped
  ppc_boxplot
  ppc_dens
  ppc_dens_overlay
  ppc_dens_overlay_grouped
  ppc_ecdf_overlay
  ppc_ecdf_overlay_grouped
  ppc_error_binned
  ppc_error_hist
  ppc_error_hist_grouped
  ppc_error_scatter
  ppc_error_scatter_avg
  ppc_error_scatter_avg_grouped
  ppc_error_scatter_avg_vs_x
  ppc_freqpoly
  ppc_freqpoly_grouped
  ppc_hist
  ppc_intervals
  ppc_intervals_grouped
  ppc_km_overlay
  ppc_km_overlay_grouped
  ppc_loo_intervals
  ppc_loo_pit
  ppc_loo_pit_overlay
  ppc_loo_pit_qq
  ppc_loo_ribbon
  ppc_pit_ecdf
  ppc_pit_ecdf_grouped
  ppc_ribbon
  ppc_ribbon_grouped
  ppc_rootogram
  ppc_scatter
  ppc_scatter_avg
  ppc_scatter_avg_grouped
  ppc_stat
  ppc_stat_2d
  ppc_stat_freqpoly
  ppc_stat_freqpoly_grouped
  ppc_stat_grouped
  ppc_violin_grouped
  • dens_overlay: plots the density distribution of the observed data (black line) overlaid on density distributions generated from draws from the model (light blue; 100 draws below). Ideally, the realisations should be roughly consistent with the observed data.
loyn.brm3 |> pp_check(type = 'dens_overlay', ndraws = 100)

The model draws appear to deviate from the observed data.

  • error_scatter_avg: this plots the average predictive errors against the observed values. Similar to a residual plot, we do not want to see any patterns in this plot. There is some pattern remaining in these errors.
loyn.brm3 |> pp_check(type = 'error_scatter_avg')
Using all posterior draws for ppc type 'error_scatter_avg' by default.

The predictive error seems to be related to the predictor - the model performs poorest at larger patch areas.

  • error_scatter_avg_vs_x: this is similar to a regular residual plot and should be interpreted as such.
loyn.brm3 |> pp_check(x = 'AREA', type = 'error_scatter_avg_vs_x')
Using all posterior draws for ppc type 'error_scatter_avg_vs_x' by default.

  • intervals: plots the observed data overlaid on posterior predictions associated with each level of the predictor. Ideally, the observed data should all fall within the predictive intervals.
loyn.brm3 |> pp_check(x = 'AREA', type = 'intervals')
Using all posterior draws for ppc type 'intervals' by default.

The modelled predictions seem to underestimate the uncertainty with increasing patch area.

  • ribbon: this is just an alternative way of expressing the above plot.
loyn.brm3 |> pp_check(x = 'AREA', type = 'ribbon')
Using all posterior draws for ppc type 'ribbon' by default.

The shinystan package allows the full suite of MCMC diagnostics and posterior predictive checks to be accessed via a web interface.

#library(shinystan)
#launch_shinystan(loyn.brm3)

DHARMa residuals provide very useful diagnostics. Unfortunately, we cannot directly use the simulateResiduals() function to generate the simulated residuals. However, if we are willing to calculate some of the components ourselves, we can still obtain simulated residuals from the fitted Stan model.

We need to supply:

  • simulated (predicted) responses associated with each observation.
  • observed values
  • fitted (predicted) responses (averaged) associated with each observation
preds <- loyn.brm3 |> posterior_predict(ndraws = 250,  summary = FALSE)
loyn.resids <- createDHARMa(simulatedResponse = t(preds),
                            observedResponse = loyn$ABUND,
                            fittedPredictedResponse = apply(preds, 2, median),
                            integerResponse = FALSE)
loyn.resids |> plot()

loyn.resids <- make_brms_dharma_res(loyn.brm3, integerResponse = FALSE)
wrap_elements(~testUniformity(loyn.resids)) +
               wrap_elements(~plotResiduals(loyn.resids, form = factor(rep(1, nrow(loyn))))) +
               wrap_elements(~plotResiduals(loyn.resids, quantreg = TRUE)) +
               wrap_elements(~testDispersion(loyn.resids))

Conclusions:

  • the simulated residuals suggest a general lack of fit due to over-dispersion and outliers
  • perhaps we should explore a negative binomial model

8 Partial effects plots

loyn.rstanarm3 |> ggpredict() |> plot(add.data=TRUE, facet=TRUE) 

#loyn.rstanarm3 |> ggemmeans(~AREA,  type='fixed') |> plot(add.data=TRUE) + scale_y_log10()
loyn.rstanarm3 |> fitted_draws(newdata=loyn) |>
  median_hdci() |>
  ggplot(aes(x=AREA, y=.value)) +
  geom_ribbon(aes(ymin=.lower, ymax=.upper), fill='blue', alpha=0.3) + 
  geom_line() +
  geom_point(data=loyn,  aes(y=ABUND,  x=AREA)) +
  scale_y_log10()
Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
Use [add_]epred_draws() to get the expectation of the posterior predictive.
Use [add_]linpred_draws() to get the distribution of the linear predictor.
For example, you used [add_]fitted_draws(..., scale = "response"), which
means you most likely want [add_]epred_draws(...).

loyn.brm3 |>
  conditional_effects() |>
  plot(ask = FALSE, points = TRUE, plot = FALSE) |>
  wrap_plots()

loyn.brm3 |>
    conditional_effects() |>
    plot(ask = FALSE, points = TRUE, plot = FALSE) |>
    wrap_plots() &
    scale_y_log10()

g <- loyn.brm3 |>
  conditional_effects() |>
  plot(ask = FALSE, points = TRUE, plot = FALSE) 
library(patchwork)
length(g)
[1] 6
(g[[1]] + scale_x_log10()) +
    (g[[2]] + scale_x_log10()) +
    (g[[3]] + scale_x_log10()) +
    g[[4]] +
    g[[5]] +
    g[[6]]

loyn.brm3 |>
    ggpredict() |>
    plot(add.data=TRUE) |>
    wrap_plots()

loyn.brm3 |>
    ggpredict() |>
    plot(add.data=TRUE) |>
    wrap_plots() &
    scale_y_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Warning: Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis
Transformation introduced infinite values in continuous y-axis

This will be slow …

loyn.brm3 |>
    ggemmeans("AREA[0:1000]") |>
    plot(add.data=TRUE) |>
    wrap_plots() &
    scale_y_log10() &
    scale_x_log10()
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

loyn.brm3 |>
    epred_draws(newdata = loyn) |>
    median_hdci(.epred) |>
    ggplot(aes(x = AREA, y = .epred, colour = fGRAZE, fill = fGRAZE)) +
    geom_ribbon(aes(ymin = .lower, ymax = .upper), colour = NA, alpha = 0.3) + 
    geom_line() +
    geom_point(data = loyn,  aes(y = ABUND,  x = AREA)) +
    scale_y_log10() +
    scale_x_log10()

9 Model investigation

The summary() method generates simple summaries (mean and standard deviation, as well as the 10th, 50th and 90th percentiles).

summary(loyn.rstanarm3)

Model Info:
 function:     stan_glm
 family:       gaussian [log]
 formula:      ABUND ~ scale(log(DIST)) + scale(log(LDIST)) + scale(log(AREA)) + 
       fGRAZE + scale(ALT) + scale(YR.ISOL)
 algorithm:    sampling
 sample:       1500 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 56
 predictors:   10

Estimates:
                    mean   sd   10%   50%   90%
(Intercept)        3.0    0.1  2.9   3.0   3.2 
scale(log(DIST))   0.0    0.1 -0.1   0.0   0.1 
scale(log(LDIST))  0.0    0.1  0.0   0.1   0.1 
scale(log(AREA))   0.2    0.1  0.1   0.2   0.3 
fGRAZE2            0.0    0.2 -0.2   0.0   0.2 
fGRAZE3            0.0    0.1 -0.1   0.0   0.2 
fGRAZE4            0.0    0.2 -0.2   0.0   0.2 
fGRAZE5           -1.1    0.4 -1.5  -1.1  -0.6 
scale(ALT)         0.0    0.1 -0.1   0.0   0.1 
scale(YR.ISOL)     0.0    0.1 -0.1   0.0   0.1 
sigma              6.6    0.7  5.8   6.5   7.5 

Fit Diagnostics:
           mean   sd   10%   50%   90%
mean_PPD 19.3    1.2 17.7  19.4  20.9 

The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).

MCMC diagnostics
                  mcse Rhat n_eff
(Intercept)       0.0  1.0  1305 
scale(log(DIST))  0.0  1.0  1288 
scale(log(LDIST)) 0.0  1.0  1497 
scale(log(AREA))  0.0  1.0  1502 
fGRAZE2           0.0  1.0  1507 
fGRAZE3           0.0  1.0  1255 
fGRAZE4           0.0  1.0  1434 
fGRAZE5           0.0  1.0  1477 
scale(ALT)        0.0  1.0  1374 
scale(YR.ISOL)    0.0  1.0  1497 
sigma             0.0  1.0  1377 
mean_PPD          0.0  1.0  1531 
log-posterior     0.1  1.0  1412 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
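
The relationship between these columns is direct: the Monte Carlo standard error is (approximately) the posterior standard deviation divided by the square root of the effective sample size. For example, using the rounded values reported for sigma above (sd 0.7, n_eff 1377):

```r
mcse <- function(sd, n_eff) sd / sqrt(n_eff)
mcse(0.7, 1377)  # ~0.019, which displays as 0.0 at one decimal place
```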

Conclusions:

  • in the Model Info, we are informed that the total MCMC posterior sample size is 1500 and that there were 56 raw observations.
  • the estimated intercept (expected bird abundance when grazing intensity is equal to 1 and all of the continuous predictors are at their respective averages) is 3.05. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes 21.02.
  • the estimated slope associated with DIST (rate at which the abundance of birds changes per 1 unit change in log-transformed Distance to nearest patch holding other continuous predictors constant and grazing intensity at level 1) is -0.01 (mean and median) with a standard deviation of 0.06. The 10 and 90 percentiles (an 80% interval) indicate that the slope is likely to be between -0.1 and 0.1 - i.e. there is no evidence of a trend.
  • associated with each of the continuous predictors is a partial slope. Each partial slope is the rate of change between bird abundance and the associated predictor (on the log scale due to the link and based on 1 unit change in the predictor on the scale of the predictor). For example, for every one unit change in centred log patch Area, bird abundance is expected to increase by (log) 0.22.
  • if we back transform this slope (by exponentiation), we get a partial slope for centred log Area of 1.24. This is interpreted as - for every 1 unit increase in (scaled log) Area, the bird abundance is expected to increase 1.24 fold. That is, there is a 24 percent increase per 1 unit increase in centred log Area.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
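Since the model uses a log link, the back-transformations quoted in the conclusions above are simple exponentiations. Using the rounded estimates from the summary:

```r
## back-transform the intercept to the response (abundance) scale
exp(3.05)   ## approximately 21
## back-transform the centred log AREA partial slope to a fold-change
exp(0.213)  ## approximately 1.24, i.e. a ~24% increase per unit
```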
loyn.rstanarm3$stanfit |>
    summarise_draws(median,
                    HDInterval::hdi,
                    rhat, length, ess_bulk, ess_tail)
# A tibble: 13 × 8
   variable              median    lower    upper  rhat length ess_bulk ess_tail
   <chr>                  <num>    <num>    <num> <num>  <num>    <num>    <num>
 1 (Intercept)          3.05e+0  2.84e+0    3.29  1.00    1500    1323.    1221.
 2 scale(log(DIST))    -1.25e-2 -1.22e-1    0.108 1.00    1500    1288.    1082.
 3 scale(log(LDIST))    5.12e-2 -8.02e-2    0.161 1.00    1500    1482.    1414.
 4 scale(log(AREA))     2.13e-1  9.47e-2    0.333 0.999   1500    1518.    1334.
 5 fGRAZE2              3.61e-2 -2.80e-1    0.337 1.00    1500    1546.    1498.
 6 fGRAZE3              3.78e-2 -2.07e-1    0.330 1.00    1500    1246.    1389.
 7 fGRAZE4              1.10e-2 -3.14e-1    0.322 1.00    1500    1417.    1216.
 8 fGRAZE5             -1.06e+0 -1.79e+0   -0.438 1.00    1500    1494.    1419.
 9 scale(ALT)          -2.00e-3 -1.03e-1    0.104 1.00    1500    1384.    1544.
10 scale(YR.ISOL)       1.91e-4 -1.55e-1    0.157 1.00    1500    1485.    1350.
11 sigma                6.49e+0  5.37e+0    8.02  0.999   1500    1386.    1428.
12 mean_PPD             1.94e+1  1.67e+1   21.5   1.00    1500    1529.    1236.
13 log-posterior       -1.96e+2 -2.02e+2 -192.    1.00    1500    1413.    1383.

We can also alter the CI level.

loyn.rstanarm3$stanfit |>
    summarise_draws(median,
                    ~HDInterval::hdi(.x, credMass = 0.9),
                    rhat, length, ess_bulk, ess_tail)
# A tibble: 13 × 8
   variable              median    lower    upper  rhat length ess_bulk ess_tail
   <chr>                  <num>    <num>    <num> <num>  <num>    <num>    <num>
 1 (Intercept)          3.05e+0  2.84e+0  3.23e+0 1.00    1500    1323.    1221.
 2 scale(log(DIST))    -1.25e-2 -1.07e-1  8.53e-2 1.00    1500    1288.    1082.
 3 scale(log(LDIST))    5.12e-2 -4.48e-2  1.55e-1 1.00    1500    1482.    1414.
 4 scale(log(AREA))     2.13e-1  1.16e-1  3.16e-1 0.999   1500    1518.    1334.
 5 fGRAZE2              3.61e-2 -2.16e-1  3.04e-1 1.00    1500    1546.    1498.
 6 fGRAZE3              3.78e-2 -1.73e-1  2.74e-1 1.00    1500    1246.    1389.
 7 fGRAZE4              1.10e-2 -2.72e-1  2.61e-1 1.00    1500    1417.    1216.
 8 fGRAZE5             -1.06e+0 -1.60e+0 -4.94e-1 1.00    1500    1494.    1419.
 9 scale(ALT)          -2.00e-3 -8.37e-2  8.54e-2 1.00    1500    1384.    1544.
10 scale(YR.ISOL)       1.91e-4 -1.31e-1  1.26e-1 1.00    1500    1485.    1350.
11 sigma                6.49e+0  5.49e+0  7.65e+0 0.999   1500    1386.    1428.
12 mean_PPD             1.94e+1  1.73e+1  2.13e+1 1.00    1500    1529.    1236.
13 log-posterior       -1.96e+2 -2.01e+2 -1.92e+2 1.00    1500    1413.    1383.

And on a ratio (fold-change) scale. Note that exponentiation is only meaningful for the parameters estimated on the log link scale - the exponentiated sigma, mean_PPD and log-posterior rows below should be ignored.

loyn.rstanarm3$stanfit |>
    summarise_draws(
        ~ median(exp(.x)),
        ~HDInterval::hdi(exp(.x)),
        rhat, length, ess_bulk, ess_tail)
# A tibble: 13 × 8
   variable  `~median(exp(.x))`    lower    upper  rhat length ess_bulk ess_tail
   <chr>                  <num>    <num>    <num> <num>  <num>    <num>    <num>
 1 (Interce…           2.11e+ 1 1.66e+ 1 2.62e+ 1 1.00    1500    1323.    1221.
 2 scale(lo…           9.88e- 1 8.80e- 1 1.11e+ 0 1.00    1500    1288.    1082.
 3 scale(lo…           1.05e+ 0 9.23e- 1 1.17e+ 0 1.00    1500    1482.    1414.
 4 scale(lo…           1.24e+ 0 1.10e+ 0 1.39e+ 0 0.999   1500    1518.    1334.
 5 fGRAZE2             1.04e+ 0 7.56e- 1 1.40e+ 0 1.00    1500    1546.    1498.
 6 fGRAZE3             1.04e+ 0 7.80e- 1 1.34e+ 0 1.00    1500    1246.    1389.
 7 fGRAZE4             1.01e+ 0 7.05e- 1 1.35e+ 0 1.00    1500    1417.    1216.
 8 fGRAZE5             3.47e- 1 1.43e- 1 6.05e- 1 1.00    1500    1494.    1419.
 9 scale(AL…           9.98e- 1 8.95e- 1 1.10e+ 0 1.00    1500    1384.    1544.
10 scale(YR…           1.00e+ 0 8.50e- 1 1.16e+ 0 1.00    1500    1485.    1350.
11 sigma               6.58e+ 2 1.34e+ 2 2.44e+ 3 0.999   1500    1386.    1428.
12 mean_PPD            2.59e+ 8 1.59e+ 6 1.81e+ 9 1.00    1500    1529.    1236.
13 log-post…           6.32e-86 1.58e-93 1.70e-84 1.00    1500    1413.    1383.
loyn.rstanarm3 |> tidy_draws() 
# A tibble: 1,500 × 20
   .chain .iteration .draw `(Intercept)` `scale(log(DIST))` `scale(log(LDIST))`
    <int>      <int> <int>         <dbl>              <dbl>               <dbl>
 1      1          1     1          2.83           -0.0313              0.133  
 2      1          2     2          3.14            0.0623             -0.00792
 3      1          3     3          3.04            0.0339              0.00387
 4      1          4     4          3.05            0.0593              0.0970 
 5      1          5     5          3.25            0.0645              0.115  
 6      1          6     6          3.10            0.0190             -0.0320 
 7      1          7     7          2.97           -0.0420             -0.0309 
 8      1          8     8          3.06           -0.00284             0.0884 
 9      1          9     9          2.79            0.0817             -0.00417
10      1         10    10          3.02            0.102               0.0491 
# ℹ 1,490 more rows
# ℹ 14 more variables: `scale(log(AREA))` <dbl>, fGRAZE2 <dbl>, fGRAZE3 <dbl>,
#   fGRAZE4 <dbl>, fGRAZE5 <dbl>, `scale(ALT)` <dbl>, `scale(YR.ISOL)` <dbl>,
#   sigma <dbl>, accept_stat__ <dbl>, stepsize__ <dbl>, treedepth__ <dbl>,
#   n_leapfrog__ <dbl>, divergent__ <dbl>, energy__ <dbl>
loyn.rstanarm3$stanfit |> as_draws_df()
# A draws_df: 500 iterations, 3 chains, and 13 variables
   (Intercept) scale(log(DIST)) scale(log(LDIST)) scale(log(AREA)) fGRAZE2
1          2.8          -0.0313            0.1326            0.276   0.214
2          3.1           0.0623           -0.0079            0.182  -0.146
3          3.0           0.0339            0.0039            0.260   0.185
4          3.1           0.0593            0.0970            0.123  -0.031
5          3.2           0.0645            0.1151            0.095   0.059
6          3.1           0.0190           -0.0320            0.242   0.113
7          3.0          -0.0420           -0.0309            0.354   0.102
8          3.1          -0.0028            0.0884            0.176   0.129
9          2.8           0.0817           -0.0042            0.372   0.310
10         3.0           0.1020            0.0491            0.061  -0.021
   fGRAZE3 fGRAZE4 fGRAZE5
1    0.238   0.328   -0.44
2   -0.038   0.073   -1.14
3    0.047  -0.069   -1.09
4    0.096  -0.040   -0.63
5   -0.286  -0.138   -1.51
6    0.010   0.076   -1.29
7    0.088   0.081   -0.62
8    0.113   0.089   -1.27
9    0.230   0.160   -0.56
10  -0.035  -0.122   -0.69
# ... with 1490 more draws, and 5 more variables
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
loyn.rstanarm3$stanfit |>
  as_draws_df() |>
  summarise_draws(
    "median",
    ~ HDInterval::hdi(.x),
    "rhat",
    "ess_bulk"
  )
# A tibble: 13 × 6
   variable               median     lower    upper  rhat ess_bulk
   <chr>                   <num>     <num>    <num> <num>    <num>
 1 (Intercept)          3.05        2.84      3.29  1.00     1323.
 2 scale(log(DIST))    -0.0125     -0.122     0.108 1.00     1288.
 3 scale(log(LDIST))    0.0512     -0.0802    0.161 1.00     1482.
 4 scale(log(AREA))     0.213       0.0947    0.333 0.999    1518.
 5 fGRAZE2              0.0361     -0.280     0.337 1.00     1546.
 6 fGRAZE3              0.0378     -0.207     0.330 1.00     1246.
 7 fGRAZE4              0.0110     -0.314     0.322 1.00     1417.
 8 fGRAZE5             -1.06       -1.79     -0.438 1.00     1494.
 9 scale(ALT)          -0.00200    -0.103     0.104 1.00     1384.
10 scale(YR.ISOL)       0.000191   -0.155     0.157 1.00     1485.
11 sigma                6.49        5.37      8.02  0.999    1386.
12 mean_PPD            19.4        16.7      21.5   1.00     1529.
13 log-posterior     -196.       -202.     -192.    1.00     1413.
tidyMCMC(loyn.rstanarm3$stanfit, estimate.method='median',  conf.int=TRUE,
         conf.method='HPDinterval',  rhat=TRUE, ess=TRUE)
# A tibble: 13 × 7
   term                 estimate std.error  conf.low conf.high  rhat   ess
   <chr>                   <dbl>     <dbl>     <dbl>     <dbl> <dbl> <int>
 1 (Intercept)          3.05        0.118     2.84       3.29  1.00   1305
 2 scale(log(DIST))    -0.0125      0.0593   -0.122      0.108 1.00   1288
 3 scale(log(LDIST))    0.0512      0.0610   -0.0802     0.161 1.00   1497
 4 scale(log(AREA))     0.213       0.0625    0.0947     0.333 0.999  1502
 5 fGRAZE2              0.0361      0.159    -0.280      0.337 1.00   1507
 6 fGRAZE3              0.0378      0.138    -0.207      0.330 1.00   1255
 7 fGRAZE4              0.0110      0.160    -0.314      0.322 1.00   1434
 8 fGRAZE5             -1.06        0.353    -1.79      -0.438 0.999  1477
 9 scale(ALT)          -0.00200     0.0530   -0.103      0.104 0.999  1374
10 scale(YR.ISOL)       0.000191    0.0794   -0.155      0.157 1.00   1497
11 sigma                6.49        0.696     5.37       8.02  0.999  1377
12 mean_PPD            19.4         1.24     16.7       21.5   1.00   1531
13 log-posterior     -196.          2.78   -202.      -192.    1.00   1412

Conclusions:

  • the estimated intercept (expected bird abundance when grazing intensity is equal to 1 and all of the continuous predictors are at their respective averages) is 3.05. This is the median of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes approximately 21.1.
  • the estimated slope associated with DIST (rate at which the abundance of birds changes per 1 unit change in log-transformed Distance to nearest patch holding other continuous predictors constant and grazing intensity at level 1) is -0.01 (median) with a standard deviation of 0.06. The 95% HPD interval indicates that we are 95% confident that the slope is between -0.12 and 0.11 - i.e. there is no evidence of a trend.
  • associated with each of the continuous predictors is a partial slope. Each partial slope is the rate of change between bird abundance and the associated predictor (on the log scale due to the link and based on a 1 unit change in the predictor on the scale of the predictor). For example, for every one unit change in centred log patch Area, bird abundance is expected to increase by (log) 0.21.
  • if we back-transform this slope (by exponentiation), we get a partial slope for centred log Area of 1.24. This is interpreted as: for every 1 unit increase in (scaled log) Area, bird abundance is expected to increase 1.24 fold. That is, there is a 24 percent increase per 1 unit increase in centred log Area.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
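The HPD intervals reported by tidyMCMC() can also be computed manually from the raw posterior draws (a sketch assuming the loyn.rstanarm3 object; as.matrix() on a stanreg object returns a matrix of draws):

```r
## posterior draws as a matrix (rows = draws, columns = parameters)
draws <- as.matrix(loyn.rstanarm3)
## 95% highest posterior density interval for the AREA partial slope
HDInterval::hdi(draws[, "scale(log(AREA))"], credMass = 0.95)
```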
loyn.rstanarm3 |> get_variables()
 [1] "(Intercept)"       "scale(log(DIST))"  "scale(log(LDIST))"
 [4] "scale(log(AREA))"  "fGRAZE2"           "fGRAZE3"          
 [7] "fGRAZE4"           "fGRAZE5"           "scale(ALT)"       
[10] "scale(YR.ISOL)"    "sigma"             "accept_stat__"    
[13] "stepsize__"        "treedepth__"       "n_leapfrog__"     
[16] "divergent__"       "energy__"         
loyn.draw <- loyn.rstanarm3 |> gather_draws(`.Intercept.*|.*AREA.*|.*DIST.*|.*GRAZE.*|.*ALT.*|.*YR.*`,  regex=TRUE)
loyn.draw
# A tibble: 15,000 × 5
# Groups:   .variable [10]
   .chain .iteration .draw .variable   .value
    <int>      <int> <int> <chr>        <dbl>
 1      1          1     1 (Intercept)   2.83
 2      1          2     2 (Intercept)   3.14
 3      1          3     3 (Intercept)   3.04
 4      1          4     4 (Intercept)   3.05
 5      1          5     5 (Intercept)   3.25
 6      1          6     6 (Intercept)   3.10
 7      1          7     7 (Intercept)   2.97
 8      1          8     8 (Intercept)   3.06
 9      1          9     9 (Intercept)   2.79
10      1         10    10 (Intercept)   3.02
# ℹ 14,990 more rows
loyn.rstanarm3 |> plot(plotfun='mcmc_intervals') 

loyn.rstanarm3 |> 
  gather_draws(`.*AREA.*|.*DIST.*|.*GRAZE.*|.*ALT.*|.*YR.*`, regex=TRUE) |> 
  ggplot() + 
  stat_halfeye(aes(x=.value,  y=.variable)) +
  facet_wrap(~.variable, scales='free')

loyn.rstanarm3 |> 
  gather_draws(`.*AREA.*|.*DIST.*|.*GRAZE.*|.*ALT.*|.*YR.*`, regex=TRUE) |> 
  ggplot() + 
    stat_halfeye(aes(x=.value,  y=.variable)) +
    geom_vline(xintercept = 0, linetype = 'dashed')

loyn.rstanarm3 |> 
  gather_draws(`.*AREA.*|.*DIST.*|.*GRAZE.*|.*ALT.*|.*YR.*`, regex=TRUE) |> 
  ggplot() + 
    geom_density_ridges(aes(x=.value, y = .variable), alpha=0.4) +
    geom_vline(xintercept = 0, linetype = 'dashed')
Picking joint bandwidth of 0.0251

##Or on a fractional scale
loyn.rstanarm3 |> 
  gather_draws(`.*AREA.*|.*DIST.*|.*GRAZE.*|.*ALT.*|.*YR.*`, regex=TRUE) |> 
  ggplot() + 
    geom_density_ridges_gradient(aes(x=exp(.value),
                                     y = .variable,
                                     fill = after_stat(x)),
                                 alpha=0.4, colour = 'white',
                                 quantile_lines = TRUE,
                                 quantiles = c(0.025, 0.975)) +
    geom_vline(xintercept = 1, linetype = 'dashed') +
    scale_x_continuous(trans = scales::log2_trans()) +
    scale_fill_viridis_c(option = "C")
Picking joint bandwidth of 0.0363
Warning: Using the `size` aesthetic with geom_segment was deprecated in ggplot2 3.4.0.
ℹ Please use the `linewidth` aesthetic instead.

loyn.rstanarm3 |> spread_draws(`.Intercept.*|.*DIST.*|.*AREA.*|.*GRAZE.*|.*ALT.*|.*YR.*`,  regex=TRUE) 
# A tibble: 1,500 × 13
   .chain .iteration .draw `(Intercept)` `scale(log(DIST))` `scale(log(LDIST))`
    <int>      <int> <int>         <dbl>              <dbl>               <dbl>
 1      1          1     1          2.83           -0.0313              0.133  
 2      1          2     2          3.14            0.0623             -0.00792
 3      1          3     3          3.04            0.0339              0.00387
 4      1          4     4          3.05            0.0593              0.0970 
 5      1          5     5          3.25            0.0645              0.115  
 6      1          6     6          3.10            0.0190             -0.0320 
 7      1          7     7          2.97           -0.0420             -0.0309 
 8      1          8     8          3.06           -0.00284             0.0884 
 9      1          9     9          2.79            0.0817             -0.00417
10      1         10    10          3.02            0.102               0.0491 
# ℹ 1,490 more rows
# ℹ 7 more variables: `scale(log(AREA))` <dbl>, fGRAZE2 <dbl>, fGRAZE3 <dbl>,
#   fGRAZE4 <dbl>, fGRAZE5 <dbl>, `scale(ALT)` <dbl>, `scale(YR.ISOL)` <dbl>
loyn.rstanarm3 |> posterior_samples() |> as_tibble() 
Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
recommended alternatives.
# A tibble: 1,500 × 11
   `(Intercept)` `scale(log(DIST))` `scale(log(LDIST))` `scale(log(AREA))`
           <dbl>              <dbl>               <dbl>              <dbl>
 1          2.83           -0.0313              0.133               0.276 
 2          3.14            0.0623             -0.00792             0.182 
 3          3.04            0.0339              0.00387             0.260 
 4          3.05            0.0593              0.0970              0.123 
 5          3.25            0.0645              0.115               0.0953
 6          3.10            0.0190             -0.0320              0.242 
 7          2.97           -0.0420             -0.0309              0.354 
 8          3.06           -0.00284             0.0884              0.176 
 9          2.79            0.0817             -0.00417             0.372 
10          3.02            0.102               0.0491              0.0613
# ℹ 1,490 more rows
# ℹ 7 more variables: fGRAZE2 <dbl>, fGRAZE3 <dbl>, fGRAZE4 <dbl>,
#   fGRAZE5 <dbl>, `scale(ALT)` <dbl>, `scale(YR.ISOL)` <dbl>, sigma <dbl>
loyn.rstanarm3 |> bayes_R2() |> median_hdci()
          y      ymin      ymax .width .point .interval
1 0.6506105 0.5049965 0.7459246   0.95 median      hdci

The summary() method generates simple summaries (posterior means, estimation errors and 95% credible intervals) along with MCMC diagnostics (Rhat and bulk/tail effective sample sizes).

loyn.brm3 |> summary()
 Family: gaussian 
  Links: mu = log; sigma = identity 
Formula: ABUND ~ scale(log(DIST)) + scale(log(LDIST)) + scale(log(AREA)) + fGRAZE + scale(ALT) + scale(YR.ISOL) 
   Data: loyn (Number of observations: 56) 
  Draws: 3 chains, each with iter = 5000; warmup = 2500; thin = 5;
         total post-warmup draws = 1500

Population-Level Effects: 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept         3.06      0.12     2.82     3.29 1.00     1547     1495
scalelogDIST     -0.01      0.06    -0.13     0.10 1.00     1259     1417
scalelogLDIST     0.05      0.06    -0.06     0.18 1.00     1406     1459
scalelogAREA      0.21      0.06     0.08     0.32 1.00     1521     1420
fGRAZE2           0.03      0.16    -0.28     0.32 1.00     1519     1347
fGRAZE3           0.03      0.14    -0.26     0.31 1.00     1573     1417
fGRAZE4          -0.01      0.17    -0.34     0.33 1.00     1664     1501
fGRAZE5          -1.13      0.37    -1.90    -0.49 1.00     1476     1343
scaleALT          0.00      0.05    -0.10     0.11 1.00     1345     1537
scaleYR.ISOL     -0.00      0.08    -0.15     0.16 1.00     1285     1349

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     6.50      0.69     5.35     7.99 1.00     1448     1499

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Conclusions:

  • in the header, we are informed that the total post-warmup MCMC sample size is 1500 (3 chains) and that there were 56 observations.
  • the estimated intercept (expected bird abundance when grazing intensity is equal to 1 and all of the continuous predictors are at their respective averages) is 3.06. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes 21.32.
  • the estimated slope associated with DIST (rate at which the abundance of birds changes per 1 unit change in log-transformed Distance to nearest patch holding other continuous predictors constant and grazing intensity at level 1) is -0.01 (mean) or -0.01 (median) with a standard deviation of 0.06. The 95% credible interval indicates that we are 95% confident that the slope is between -0.13 and 0.10 - i.e. there is no evidence of a trend.
  • associated with each of the continuous predictors is a partial slope. Each partial slope is the rate of change between bird abundance and the associated predictor (on the log scale due to the link and based on 1 unit change in the predictor on the scale of the predictor). For example, for every one unit change in centred log patch Area, bird abundance is expected to increase by (log) 0.21.
  • if we back transform this slope (by exponentiation), we get a partial slope for centred log Area of 1.23. This is interpreted as - for every 1 unit increase in (scaled log) Area, the bird abundance is expected to increase 1.23 fold. That is, there is a 22.77 percent increase per 1 unit increase in centred log Area.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
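Whole credible intervals can be back-transformed the same way as point estimates. For example, exponentiating the 95% CI bounds reported for the (scaled log) AREA slope gives the interval on the fold-change scale:

```r
## exponentiate the 95% CI limits of the AREA partial slope
exp(c(0.08, 0.32))  ## roughly 1.08 to 1.38 fold change in abundance
```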
loyn.brm3 |> tidy_draws() 
# A tibble: 1,500 × 25
   .chain .iteration .draw b_Intercept b_scalelogDIST b_scalelogLDIST
    <int>      <int> <int>       <dbl>          <dbl>           <dbl>
 1      1          1     1        3.02         0.0116         0.0635 
 2      1          2     2        3.08         0.0222         0.125  
 3      1          3     3        3.08        -0.0793         0.0630 
 4      1          4     4        3.19         0.0109         0.141  
 5      1          5     5        2.90         0.0193         0.00407
 6      1          6     6        3.14         0.0468         0.0406 
 7      1          7     7        2.93        -0.0550         0.121  
 8      1          8     8        2.79        -0.0622         0.176  
 9      1          9     9        3.21        -0.101          0.121  
10      1         10    10        3.15        -0.0297         0.0896 
# ℹ 1,490 more rows
# ℹ 19 more variables: b_scalelogAREA <dbl>, b_fGRAZE2 <dbl>, b_fGRAZE3 <dbl>,
#   b_fGRAZE4 <dbl>, b_fGRAZE5 <dbl>, b_scaleALT <dbl>, b_scaleYR.ISOL <dbl>,
#   sigma <dbl>, prior_Intercept <dbl>, prior_b <dbl>, prior_sigma <dbl>,
#   lprior <dbl>, lp__ <dbl>, accept_stat__ <dbl>, treedepth__ <dbl>,
#   stepsize__ <dbl>, divergent__ <dbl>, n_leapfrog__ <dbl>, energy__ <dbl>
loyn.brm3 |>
    tidy_draws() |>
    dplyr::select(starts_with("b_")) |>
    exp() |>
    summarise_draws(median,
                    HDInterval::hdi,
                    rhat,
                    length,
                    ess_bulk, ess_tail)
# A tibble: 10 × 8
   variable        median  lower  upper  rhat length ess_bulk ess_tail
   <chr>            <num>  <num>  <num> <num>  <num>    <num>    <num>
 1 b_Intercept     21.4   16.5   26.3   1.00    1500    1535.    1459.
 2 b_scalelogDIST   0.990  0.877  1.11  1.00    1500    1247.    1395.
 3 b_scalelogLDIST  1.05   0.938  1.19  0.999   1500    1365.    1447.
 4 b_scalelogAREA   1.23   1.09   1.38  1.00    1500    1509.    1414.
 5 b_fGRAZE2        1.04   0.713  1.33  1.00    1500    1488.    1340.
 6 b_fGRAZE3        1.03   0.755  1.34  0.999   1500    1564.    1414.
 7 b_fGRAZE4        0.996  0.669  1.31  1.00    1500    1595.    1457.
 8 b_fGRAZE5        0.330  0.121  0.566 1.00    1500    1462.    1328.
 9 b_scaleALT       1.00   0.898  1.11  1.00    1500    1333.    1512.
10 b_scaleYR.ISOL   0.995  0.852  1.15  1.00    1500    1261.    1334.
loyn.brm3 |>
    spread_draws(`^b_.*|sigma`, regex = TRUE) |>
    exp() |>
    summarise_draws(median,
                    HDInterval::hdi,
                    rhat,
                    length,
                    ess_bulk, ess_tail)
# A tibble: 11 × 8
   variable         median   lower    upper  rhat length ess_bulk ess_tail
   <chr>             <num>   <num>    <num> <num>  <num>    <num>    <num>
 1 b_Intercept      21.4    16.5     26.3   1.00    1500    1547.    1495.
 2 b_scalelogDIST    0.990   0.877    1.11  1.00    1500    1260.    1417.
 3 b_scalelogLDIST   1.05    0.938    1.19  1.00    1500    1406.    1459.
 4 b_scalelogAREA    1.23    1.09     1.38  1.00    1500    1521.    1420.
 5 b_fGRAZE2         1.04    0.713    1.33  1.00    1500    1518.    1347.
 6 b_fGRAZE3         1.03    0.755    1.34  0.999   1500    1573.    1417.
 7 b_fGRAZE4         0.996   0.669    1.31  1.00    1500    1664.    1501.
 8 b_fGRAZE5         0.330   0.121    0.566 1.00    1500    1476.    1343.
 9 b_scaleALT        1.00    0.898    1.11  1.00    1500    1345.    1537.
10 b_scaleYR.ISOL    0.995   0.852    1.15  1.00    1500    1284.    1349.
11 sigma           618.    126.    2284.    1.00    1500    1448.    1499.
loyn.brm3 |> as_draws_df()
# A draws_df: 500 iterations, 3 chains, and 16 variables
   b_Intercept b_scalelogDIST b_scalelogLDIST b_scalelogAREA b_fGRAZE2
1          3.0          0.012          0.0635           0.18     0.042
2          3.1          0.022          0.1248           0.17    -0.127
3          3.1         -0.079          0.0630           0.26    -0.025
4          3.2          0.011          0.1411           0.14    -0.098
5          2.9          0.019          0.0041           0.24     0.286
6          3.1          0.047          0.0406           0.18     0.100
7          2.9         -0.055          0.1210           0.33     0.013
8          2.8         -0.062          0.1760           0.20    -0.040
9          3.2         -0.101          0.1209           0.17    -0.110
10         3.1         -0.030          0.0896           0.21     0.034
   b_fGRAZE3 b_fGRAZE4 b_fGRAZE5
1     0.0105   -0.0082     -0.86
2     0.0084   -0.0057     -0.93
3     0.0361   -0.0822     -0.91
4    -0.0507   -0.0611     -0.96
5     0.1847    0.1297     -0.94
6    -0.0695   -0.0797     -1.04
7     0.0816    0.1376     -0.94
8     0.3040    0.0972     -0.27
9    -0.2545   -0.1272     -1.90
10   -0.0864    0.1405     -1.27
# ... with 1490 more draws, and 8 more variables
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
loyn.brm3 |>
  as_draws_df() |>
  summarise_draws(
    "median",
    ~ HDInterval::hdi(.x),
    "rhat",
    "ess_bulk"
  )
# A tibble: 16 × 6
   variable             median       lower    upper  rhat ess_bulk
   <chr>                 <num>       <num>    <num> <num>    <num>
 1 b_Intercept        3.06        2.82        3.29  1.00     1547.
 2 b_scalelogDIST    -0.0102     -0.129       0.106 1.00     1260.
 3 b_scalelogLDIST    0.0533     -0.0645      0.177 1.00     1406.
 4 b_scalelogAREA     0.206       0.0879      0.323 1.00     1521.
 5 b_fGRAZE2          0.0377     -0.288       0.322 1.00     1518.
 6 b_fGRAZE3          0.0255     -0.232       0.326 0.999    1573.
 7 b_fGRAZE4         -0.00450    -0.368       0.293 1.00     1664.
 8 b_fGRAZE5         -1.11       -1.86       -0.466 1.00     1476.
 9 b_scaleALT         0.00165    -0.107       0.104 1.00     1345.
10 b_scaleYR.ISOL    -0.00543    -0.160       0.143 1.00     1284.
11 sigma              6.43        5.25        7.81  1.00     1448.
12 prior_Intercept    2.98        1.97        3.92  1.00     1321.
13 prior_b           -0.000828   -3.08        2.79  1.00     1573.
14 prior_sigma        1.95        0.000907    7.76  0.999    1483.
15 lprior           -16.1       -16.9       -15.5   1.00     1398.
16 lp__            -199.       -204.       -194.    0.999    1516.
loyn.brm3$fit |>
    tidyMCMC(estimate.method = 'median',
             conf.int = TRUE,  conf.method = 'HPDinterval',
             rhat = TRUE, ess = TRUE)
# A tibble: 15 × 7
   term              estimate std.error   conf.low conf.high  rhat   ess
   <chr>                <dbl>     <dbl>      <dbl>     <dbl> <dbl> <int>
 1 b_Intercept       3.06        0.117    2.82         3.29  1.00   1512
 2 b_scalelogDIST   -0.0102      0.0602  -0.129        0.106 1.00   1257
 3 b_scalelogLDIST   0.0533      0.0616  -0.0645       0.177 0.999  1374
 4 b_scalelogAREA    0.206       0.0592   0.0879       0.323 0.999  1506
 5 b_fGRAZE2         0.0377      0.156   -0.288        0.322 1.00   1499
 6 b_fGRAZE3         0.0255      0.143   -0.232        0.326 0.999  1567
 7 b_fGRAZE4        -0.00450     0.167   -0.368        0.293 1.00   1659
 8 b_fGRAZE5        -1.11        0.365   -1.86        -0.466 1.00   1454
 9 b_scaleALT        0.00165     0.0527  -0.107        0.104 1.00   1335
10 b_scaleYR.ISOL   -0.00543     0.0780  -0.160        0.143 1.00   1243
11 sigma             6.43        0.690    5.25         7.81  0.999  1440
12 prior_Intercept   2.98        0.501    1.97         3.92  1.00   1316
13 prior_b          -0.000828    1.51    -3.08         2.79  0.999  1552
14 prior_sigma       1.95        2.93     0.000907     7.76  0.998  1470
15 lprior          -16.1         0.401  -16.9        -15.5   0.998  1348

Conclusions:

  • the estimated intercept (expected bird abundance when grazing intensity is equal to 1 and all of the continuous predictors are at their respective averages) is 3.06. This is the median of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes approximately 21.3.
  • the estimated slope associated with DIST (rate at which the abundance of birds changes per 1 unit change in log-transformed Distance to nearest patch holding other continuous predictors constant and grazing intensity at level 1) is -0.01 (median) with a standard deviation of 0.06. The 95% HPD interval indicates that we are 95% confident that the slope is between -0.13 and 0.11 - i.e. there is no evidence of a trend.
  • associated with each of the continuous predictors is a partial slope. Each partial slope is the rate of change between bird abundance and the associated predictor (on the log scale due to the link and based on a 1 unit change in the predictor on the scale of the predictor). For example, for every one unit change in centred log patch Area, bird abundance is expected to increase by (log) 0.21.
  • if we back-transform this slope (by exponentiation), we get a partial slope for centred log Area of 1.23. This is interpreted as: for every 1 unit increase in (scaled log) Area, bird abundance is expected to increase 1.23 fold. That is, there is a 23 percent increase per 1 unit increase in centred log Area.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
loyn.brm3 |> get_variables()
 [1] "b_Intercept"     "b_scalelogDIST"  "b_scalelogLDIST" "b_scalelogAREA" 
 [5] "b_fGRAZE2"       "b_fGRAZE3"       "b_fGRAZE4"       "b_fGRAZE5"      
 [9] "b_scaleALT"      "b_scaleYR.ISOL"  "sigma"           "prior_Intercept"
[13] "prior_b"         "prior_sigma"     "lprior"          "lp__"           
[17] "accept_stat__"   "treedepth__"     "stepsize__"      "divergent__"    
[21] "n_leapfrog__"    "energy__"       
loyn.draw <- loyn.brm3 |> gather_draws(`^b_.*`,  regex = TRUE)
loyn.draw
# A tibble: 15,000 × 5
# Groups:   .variable [10]
   .chain .iteration .draw .variable   .value
    <int>      <int> <int> <chr>        <dbl>
 1      1          1     1 b_Intercept   3.02
 2      1          2     2 b_Intercept   3.08
 3      1          3     3 b_Intercept   3.08
 4      1          4     4 b_Intercept   3.19
 5      1          5     5 b_Intercept   2.90
 6      1          6     6 b_Intercept   3.14
 7      1          7     7 b_Intercept   2.93
 8      1          8     8 b_Intercept   2.79
 9      1          9     9 b_Intercept   3.21
10      1         10    10 b_Intercept   3.15
# ℹ 14,990 more rows
loyn.brm3 |> mcmc_plot(type = 'intervals') 

loyn.brm3 |> 
    gather_draws(`^b_.*`, regex=TRUE) |> 
    filter(.variable != 'b_Intercept') |>
    ggplot() + 
    stat_halfeye(aes(x=.value,  y=.variable)) +
    facet_wrap(~.variable, scales='free')

loyn.brm3 |> 
    gather_draws(`^b_.*`, regex=TRUE) |> 
    filter(.variable != 'b_Intercept') |>
    ggplot() + 
    stat_halfeye(aes(x=.value,  y=.variable)) +
    geom_vline(xintercept = 0, linetype = 'dashed')

loyn.brm3 |> 
    gather_draws(`^b_.*`, regex=TRUE) |> 
    filter(.variable != 'b_Intercept') |>
    ggplot() +  
    geom_density_ridges(aes(x=.value, y = .variable), alpha=0.4) +
    geom_vline(xintercept = 0, linetype = 'dashed')
Picking joint bandwidth of 0.0258

## Or on a fractional scale
loyn.brm3 |> 
    gather_draws(`^b_.*`, regex = TRUE) |> 
    filter(.variable != 'b_Intercept') |>
    ggplot() + 
    geom_density_ridges_gradient(aes(x = exp(.value),
                                     y = .variable,
                                     fill = after_stat(x)),
                                 alpha=0.4, colour = 'white',
                                 quantile_lines = TRUE,
                                 quantiles = c(0.025, 0.975)) +
    geom_vline(xintercept = 1, linetype = 'dashed') +
    scale_x_continuous(trans = scales::log2_trans()) +
    scale_fill_viridis_c(option = "C")
Picking joint bandwidth of 0.0373

loyn.brm3 |> spread_draws(`^b_.*`,  regex = TRUE) 
# A tibble: 1,500 × 13
   .chain .iteration .draw b_Intercept b_scalelogDIST b_scalelogLDIST
    <int>      <int> <int>       <dbl>          <dbl>           <dbl>
 1      1          1     1        3.02         0.0116         0.0635 
 2      1          2     2        3.08         0.0222         0.125  
 3      1          3     3        3.08        -0.0793         0.0630 
 4      1          4     4        3.19         0.0109         0.141  
 5      1          5     5        2.90         0.0193         0.00407
 6      1          6     6        3.14         0.0468         0.0406 
 7      1          7     7        2.93        -0.0550         0.121  
 8      1          8     8        2.79        -0.0622         0.176  
 9      1          9     9        3.21        -0.101          0.121  
10      1         10    10        3.15        -0.0297         0.0896 
# ℹ 1,490 more rows
# ℹ 7 more variables: b_scalelogAREA <dbl>, b_fGRAZE2 <dbl>, b_fGRAZE3 <dbl>,
#   b_fGRAZE4 <dbl>, b_fGRAZE5 <dbl>, b_scaleALT <dbl>, b_scaleYR.ISOL <dbl>
loyn.brm3 |> posterior_samples() |> as_tibble() 
Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
recommended alternatives.
# A tibble: 1,500 × 16
   b_Intercept b_scalelogDIST b_scalelogLDIST b_scalelogAREA b_fGRAZE2 b_fGRAZE3
         <dbl>          <dbl>           <dbl>          <dbl>     <dbl>     <dbl>
 1        3.02         0.0116         0.0635           0.184    0.0421   0.0105 
 2        3.08         0.0222         0.125            0.174   -0.127    0.00843
 3        3.08        -0.0793         0.0630           0.261   -0.0252   0.0361 
 4        3.19         0.0109         0.141            0.140   -0.0984  -0.0507 
 5        2.90         0.0193         0.00407          0.241    0.286    0.185  
 6        3.14         0.0468         0.0406           0.184    0.100   -0.0695 
 7        2.93        -0.0550         0.121            0.333    0.0133   0.0816 
 8        2.79        -0.0622         0.176            0.200   -0.0396   0.304  
 9        3.21        -0.101          0.121            0.166   -0.110   -0.255  
10        3.15        -0.0297         0.0896           0.211    0.0343  -0.0864 
# ℹ 1,490 more rows
# ℹ 10 more variables: b_fGRAZE4 <dbl>, b_fGRAZE5 <dbl>, b_scaleALT <dbl>,
#   b_scaleYR.ISOL <dbl>, sigma <dbl>, prior_Intercept <dbl>, prior_b <dbl>,
#   prior_sigma <dbl>, lprior <dbl>, lp__ <dbl>
loyn.brm3 |> bayes_R2(summary = FALSE) |> median_hdci()
         y      ymin      ymax .width .point .interval
1 0.654261 0.5622812 0.7234179   0.95 median      hdci

Region of Practical Equivalence (ROPE)

For standardised parameters, a negligible effect is often defined as 0.1 times the standard deviation of the (log-transformed) response:

0.1 * sd(log(loyn$ABUND))
[1] 0.09024351
loyn.brm3 |> bayestestR::rope_range()
[1] -0.01  0.01
loyn.brm3 |> bayestestR::rope(range = c(-0.09, 0.09))
# Proportion of samples inside the ROPE [-0.09, 0.09]:

Parameter     | inside ROPE
---------------------------
Intercept     |      0.00 %
scalelogDIST  |     91.78 %
scalelogLDIST |     73.67 %
scalelogAREA  |      0.35 %
fGRAZE2       |     43.75 %
fGRAZE3       |     49.37 %
fGRAZE4       |     44.52 %
fGRAZE5       |      0.00 %
scaleALT      |     96.07 %
scaleYR.ISOL  |     79.85 %
loyn.brm3 |> bayestestR::rope(range = c(-0.09, 0.09)) |>
    plot(data = loyn.brm3)
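The percentages reported by rope() are simply the proportion of posterior draws falling inside the ROPE. A minimal sketch of that calculation using simulated (hypothetical) draws:

```r
# Proportion of (simulated) posterior draws inside a ROPE of [-0.09, 0.09]
set.seed(1)
draws <- rnorm(1000, mean = 0.02, sd = 0.05)  # stand-in for a parameter's posterior
mean(draws > -0.09 & draws < 0.09)
```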

10 Further analyses

loyn.rstanarm4a <- update(loyn.rstanarm3,  .~scale(log(DIST))*scale(log(LDIST)),
                          diagnostic_file = file.path(tempdir(), "dfa.csv"))
loyn.rstanarm4b <- update(loyn.rstanarm3,  .~scale(log(AREA)) * fGRAZE,
                          diagnostic_file = file.path(tempdir(), "dfb.csv"))
loyn.rstanarm4c <- update(loyn.rstanarm3,  .~scale(log(AREA)) * fGRAZE * scale(YR.ISOL),
                          diagnostic_file = file.path(tempdir(), "dfc.csv"))
loyn.rstanarm4d <- update(loyn.rstanarm3,  .~scale(ALT),
                          diagnostic_file = file.path(tempdir(), "dfd.csv"))
loyn.rstanarm4e <- update(loyn.rstanarm3,  .~1,
                          diagnostic_file = file.path(tempdir(), "dfe.csv"))
loo_compare(loo(loyn.rstanarm4a),
            loo(loyn.rstanarm4e)
            )
Warning: Found 1 observation(s) with a pareto_k > 0.7. We recommend calling 'loo' again with argument 'k_threshold = 0.7' in order to calculate the ELPD without the assumption that these observations are negligible. This will refit the model 1 times to compute the ELPDs for the problematic observations directly.
                elpd_diff se_diff
loyn.rstanarm4e  0.0       0.0   
loyn.rstanarm4a -2.5       1.6   
loo_compare(loo(loyn.rstanarm4b),
            loo(loyn.rstanarm4e)
            )
                elpd_diff se_diff
loyn.rstanarm4b   0.0       0.0  
loyn.rstanarm4e -30.6       7.2  
loo_compare(loo(loyn.rstanarm4c),
            loo(loyn.rstanarm4e)
            )
Warning: Found 5 observation(s) with a pareto_k > 0.7. We recommend calling 'loo' again with argument 'k_threshold = 0.7' in order to calculate the ELPD without the assumption that these observations are negligible. This will refit the model 5 times to compute the ELPDs for the problematic observations directly.
                elpd_diff se_diff
loyn.rstanarm4c   0.0       0.0  
loyn.rstanarm4e -22.3       6.7  
loo_compare(loo(loyn.rstanarm4d),
            loo(loyn.rstanarm4e)
            )
                elpd_diff se_diff
loyn.rstanarm4d  0.0       0.0   
loyn.rstanarm4e -3.5       2.3   
bayes_factor(bridge_sampler(loyn.rstanarm4a),
             bridge_sampler(loyn.rstanarm4e))
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Estimated Bayes factor in favor of x1 over x2: 0.00139
bayes_factor(bridge_sampler(loyn.rstanarm4b),
             bridge_sampler(loyn.rstanarm4e))
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Estimated Bayes factor in favor of x1 over x2: 1081033284.51737
bayes_factor(bridge_sampler(loyn.rstanarm4c),
             bridge_sampler(loyn.rstanarm4e))
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Iteration: 7
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Estimated Bayes factor in favor of x1 over x2: 7540.31759
bayes_factor(bridge_sampler(loyn.rstanarm4d),
             bridge_sampler(loyn.rstanarm4e))
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Estimated Bayes factor in favor of x1 over x2: 4.97595
loyn.brm4a <- update(loyn.brm3,  .~scale(log(DIST))*scale(log(LDIST)),
                     save_pars = save_pars(all = TRUE), refresh = 0)
Start sampling
Running MCMC with 3 sequential chains...

Chain 1 finished in 0.2 seconds.
Chain 2 finished in 0.2 seconds.
Chain 3 finished in 0.2 seconds.

All 3 chains finished successfully.
Mean chain execution time: 0.2 seconds.
Total execution time: 0.7 seconds.
loyn.brm4b <- update(loyn.brm3,  .~scale(log(AREA)) * fGRAZE,
                     save_pars = save_pars(all = TRUE), refresh = 0)
Start sampling
Running MCMC with 3 sequential chains...

Chain 1 finished in 0.3 seconds.
Chain 2 finished in 0.3 seconds.
Chain 3 finished in 0.3 seconds.

All 3 chains finished successfully.
Mean chain execution time: 0.3 seconds.
Total execution time: 1.3 seconds.
loyn.brm4c <- update(loyn.brm3,  .~scale(log(AREA)) * fGRAZE * scale(YR.ISOL),
                     save_pars = save_pars(all = TRUE), refresh = 0)
Start sampling
Running MCMC with 3 sequential chains...

Chain 1 finished in 1.4 seconds.
Chain 2 finished in 1.5 seconds.
Chain 3 finished in 1.7 seconds.

All 3 chains finished successfully.
Mean chain execution time: 1.5 seconds.
Total execution time: 4.9 seconds.
loyn.brm4d <- update(loyn.brm3,  .~scale(ALT),
                     save_pars = save_pars(all = TRUE), refresh = 0)
Start sampling
Running MCMC with 3 sequential chains...

Chain 1 finished in 0.1 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.

All 3 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.8 seconds.
loyn.brm4e <- update(loyn.brm3,  .~1,
                     save_pars = save_pars(all = TRUE), refresh = 0)
The desired updates require recompiling the model
Start sampling
Running MCMC with 3 sequential chains...

Chain 1 finished in 0.1 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.

All 3 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.4 seconds.
waic(loyn.brm4a)
Warning: 
1 (1.8%) p_waic estimates greater than 0.4. We recommend trying loo instead.

Computed from 1500 by 56 log-likelihood matrix

          Estimate  SE
elpd_waic   -215.9 3.6
p_waic         4.5 0.8
waic         431.8 7.2

1 (1.8%) p_waic estimates greater than 0.4. We recommend trying loo instead. 
loo(loyn.brm4a)

Computed from 1500 by 56 log-likelihood matrix

         Estimate  SE
elpd_loo   -216.0 3.6
p_loo         4.6 0.8
looic       432.0 7.2
------
Monte Carlo SE of elpd_loo is 0.1.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     55    98.2%   674       
 (0.5, 0.7]   (ok)        1     1.8%   624       
   (0.7, 1]   (bad)       0     0.0%   <NA>      
   (1, Inf)   (very bad)  0     0.0%   <NA>      

All Pareto k estimates are ok (k < 0.7).
See help('pareto-k-diagnostic') for details.
loo(loyn.brm4e)

Computed from 1500 by 56 log-likelihood matrix

         Estimate  SE
elpd_loo   -213.7 3.9
p_loo         1.6 0.2
looic       427.4 7.7
------
Monte Carlo SE of elpd_loo is 0.0.

All Pareto k estimates are good (k < 0.5).
See help('pareto-k-diagnostic') for details.
loo_compare(loo(loyn.brm4a),
            loo(loyn.brm4e)
            )
           elpd_diff se_diff
loyn.brm4e  0.0       0.0   
loyn.brm4a -2.3       1.6   
loo_compare(loo(loyn.brm4b),
            loo(loyn.brm4e)
            )
Warning: Found 3 observations with a pareto_k > 0.7 in model 'loyn.brm4b'. It
is recommended to set 'moment_match = TRUE' in order to perform moment matching
for problematic observations.
           elpd_diff se_diff
loyn.brm4b   0.0       0.0  
loyn.brm4e -30.4       7.4  
loo_compare(loo(loyn.brm4b, moment_match = TRUE),
            loo(loyn.brm4e)
            )
Recompiling the model with 'rstan'
Recompilation done
Warning: Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.
           elpd_diff se_diff
loyn.brm4b   0.0       0.0  
loyn.brm4e -30.4       7.4  
loo_compare(loo(loyn.brm4c, moment_match = TRUE),
            loo(loyn.brm4e)
            )
Recompiling the model with 'rstan'
Recompilation done
Warning: Some Pareto k diagnostic values are slightly high. See help('pareto-k-diagnostic') for details.
           elpd_diff se_diff
loyn.brm4c   0.0       0.0  
loyn.brm4e -21.7       6.9  
loo_compare(loo(loyn.brm4d),
            loo(loyn.brm4e)
            )
           elpd_diff se_diff
loyn.brm4d  0.0       0.0   
loyn.brm4e -3.6       2.3   
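As an informal heuristic (an assumption on my part, not a formal test), an elpd difference whose magnitude exceeds roughly twice its standard error suggests a genuine difference in expected predictive accuracy:

```r
# Values taken from the loyn.brm4b vs intercept-only comparison above
elpd_diff <- -30.4
se_diff   <- 7.4
abs(elpd_diff) > 2 * se_diff  # TRUE: the AREA x GRAZE model clearly predicts better
```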

An alternative is to compute Bayes factors based on bridge sampling. Note that this process usually requires a far greater number of posterior samples (around 10x) in order to be stable. It is also advisable to run it multiple times to confirm stability.

A Bayes factor is the ratio of the marginal likelihoods of two models. The larger the value, the greater the evidence in favour of the first model over the second.
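Concretely, bridge sampling estimates each model's (log) marginal likelihood, and the Bayes factor is the exponentiated difference. A sketch with hypothetical values:

```r
# Hypothetical log marginal likelihoods from bridge sampling
logml_1 <- -210.2  # model of interest
logml_2 <- -215.8  # null model (e.g. intercept-only)
exp(logml_1 - logml_2)  # Bayes factor in favour of model 1 (~270)
```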

bayes_factor(loyn.brm4a,
            loyn.brm4e)
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Estimated Bayes factor in favor of loyn.brm4a over loyn.brm4e: 0.00040
# OR
bayes_factor(loyn.brm4e,
            loyn.brm4a)
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Estimated Bayes factor in favor of loyn.brm4e over loyn.brm4a: 2433.63374
bayes_factor(loyn.brm4b,
             loyn.brm4e)
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Estimated Bayes factor in favor of loyn.brm4b over loyn.brm4e: 116156587.39543
# OR
bayes_factor(loyn.brm4e,
             loyn.brm4b)
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Iteration: 7
Estimated Bayes factor in favor of loyn.brm4e over loyn.brm4b: 0.00000
bayes_factor(loyn.brm4c,
             loyn.brm4e)
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Iteration: 7
Iteration: 8
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Estimated Bayes factor in favor of loyn.brm4c over loyn.brm4e: 50.56553
# OR
bayes_factor(loyn.brm4e,
             loyn.brm4c)
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Iteration: 7
Iteration: 8
Estimated Bayes factor in favor of loyn.brm4e over loyn.brm4c: 0.02034
bayes_factor(loyn.brm4d,
             loyn.brm4e)
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Estimated Bayes factor in favor of loyn.brm4d over loyn.brm4e: 3.76348
# OR
bayes_factor(loyn.brm4e,
             loyn.brm4d)
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Recompiling the model with 'rstan'
Recompilation done
Iteration: 1
Iteration: 2
Iteration: 3
Iteration: 4
Iteration: 5
Iteration: 6
Estimated Bayes factor in favor of loyn.brm4e over loyn.brm4d: 0.26484
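If we are willing to assume equal prior model odds (an assumption, not something the analysis above establishes), a Bayes factor converts directly into a posterior model probability:

```r
BF <- 3.76     # e.g. the Bayes factor of loyn.brm4d over the intercept-only model
BF / (1 + BF)  # posterior probability of the first model (~0.79), given equal prior odds
```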

Compare the effect of grazing at different patch areas.

loyn.list <- with(loyn, list(AREA = c(min(AREA), mean(AREA), max(AREA))))
 
loyn.brm4b |>
    emmeans(~fGRAZE|AREA, at = loyn.list, type = "response") |>
    pairs(reverse = FALSE)
AREA = 0.1:
 contrast           ratio lower.HPD upper.HPD
 fGRAZE1 / fGRAZE2  2.027   0.87934     4.122
 fGRAZE1 / fGRAZE3  2.234   0.99716     4.190
 fGRAZE1 / fGRAZE4  9.429   0.96176    54.816
 fGRAZE1 / fGRAZE5 11.269   0.88471    70.297
 fGRAZE2 / fGRAZE3  1.111   0.29762     2.204
 fGRAZE2 / fGRAZE4  4.576   0.25708    27.668
 fGRAZE2 / fGRAZE5  5.518   0.37513    34.499
 fGRAZE3 / fGRAZE4  4.179   0.30209    23.917
 fGRAZE3 / fGRAZE5  5.062   0.43230    31.686
 fGRAZE4 / fGRAZE5  1.216   0.02015    10.576

AREA = 69.2696428571429:
 contrast           ratio lower.HPD upper.HPD
 fGRAZE1 / fGRAZE2  0.905   0.69062     1.144
 fGRAZE1 / fGRAZE3  0.848   0.64652     1.122
 fGRAZE1 / fGRAZE4  0.494   0.20616     0.930
 fGRAZE1 / fGRAZE5  1.484   0.36803     3.921
 fGRAZE2 / fGRAZE3  0.937   0.64195     1.285
 fGRAZE2 / fGRAZE4  0.541   0.21900     1.041
 fGRAZE2 / fGRAZE5  1.629   0.35264     4.347
 fGRAZE3 / fGRAZE4  0.582   0.22298     1.116
 fGRAZE3 / fGRAZE5  1.741   0.35246     4.604
 fGRAZE4 / fGRAZE5  3.082   0.60455     9.184

AREA = 1771:
 contrast           ratio lower.HPD upper.HPD
 fGRAZE1 / fGRAZE2  0.597   0.29537     0.981
 fGRAZE1 / fGRAZE3  0.521   0.25267     0.921
 fGRAZE1 / fGRAZE4  0.113   0.00336     0.524
 fGRAZE1 / fGRAZE5  0.539   0.00253     3.518
 fGRAZE2 / fGRAZE3  0.868   0.29228     1.743
 fGRAZE2 / fGRAZE4  0.189   0.00625     0.965
 fGRAZE2 / fGRAZE5  0.883   0.00289     6.202
 fGRAZE3 / fGRAZE4  0.218   0.00322     1.048
 fGRAZE3 / fGRAZE5  1.013   0.00437     7.075
 fGRAZE4 / fGRAZE5  4.790   0.00837    56.746

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
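The ratios above arise because, under a log link, a difference between groups on the link scale exponentiates to a ratio on the response scale. A minimal sketch with hypothetical group means:

```r
mu1 <- log(20)   # hypothetical mean log-abundance at grazing level 1
mu2 <- log(10)   # hypothetical mean log-abundance at grazing level 2
exp(mu1 - mu2)   # back-transformed contrast: a ratio of 2
```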
newdata <- loyn.brm4b |>
    emmeans(~fGRAZE|AREA, at = loyn.list, type = 'response') |>
    pairs() |>
    as.data.frame()
head(newdata)
 contrast          AREA     ratio lower.HPD upper.HPD
 fGRAZE1 / fGRAZE2 0.1   2.026957 0.8793432   4.12190
 fGRAZE1 / fGRAZE3 0.1   2.234104 0.9971590   4.18955
 fGRAZE1 / fGRAZE4 0.1   9.429410 0.9617564  54.81644
 fGRAZE1 / fGRAZE5 0.1  11.268673 0.8847058  70.29720
 fGRAZE2 / fGRAZE3 0.1   1.110790 0.2976185   2.20414
 fGRAZE2 / fGRAZE4 0.1   4.575734 0.2570813  27.66753

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
newdata <- loyn.brm4b |> 
    emmeans(~fGRAZE|AREA, at = loyn.list, type = 'response') |>
    pairs() |>
    gather_emmeans_draws()

newdata |> median_hdci() |>
    ggplot() +
    geom_hline(yintercept = 1, linetype = 'dashed') +
    geom_pointrange(aes(y = .value, ymin = .lower, ymax = .upper, x = contrast)) +
    facet_wrap(~AREA) +
    coord_flip()

loyn.brm4b |>
    emmeans(~fGRAZE|AREA, at = loyn.list, type = 'response') |>
    gather_emmeans_draws() 
# A tibble: 22,500 × 6
# Groups:   fGRAZE, AREA [15]
   fGRAZE  AREA .chain .iteration .draw .value
   <fct>  <dbl>  <int>      <int> <int>  <dbl>
 1 1        0.1     NA         NA     1   2.87
 2 1        0.1     NA         NA     2   2.91
 3 1        0.1     NA         NA     3   2.52
 4 1        0.1     NA         NA     4   2.61
 5 1        0.1     NA         NA     5   2.88
 6 1        0.1     NA         NA     6   3.12
 7 1        0.1     NA         NA     7   2.96
 8 1        0.1     NA         NA     8   2.86
 9 1        0.1     NA         NA     9   3.32
10 1        0.1     NA         NA    10   2.90
# ℹ 22,490 more rows
newdata.p <- newdata |> summarise(P = mean(.value>1))
`summarise()` has grouped output by 'contrast'. You can override using the
`.groups` argument.
g <- newdata |>
    ggplot() +
    geom_vline(xintercept = 1, linetype = 'dashed') +
    stat_slab(aes(x = .value, y = contrast,
                  fill = after_stat(ggdist::cut_cdf_qi(cdf,
                           .width = c(0.5, 0.8, 0.95), 
                           labels = scales::percent_format())
                           )), color = 'black') + 
    scale_fill_brewer('Interval', direction = -1, na.translate = FALSE) +
    facet_grid(~round(AREA,1))
g + geom_text(data = newdata.p,
              aes(y = contrast, x = 1, label = paste('P =', round(P, 3))),
              hjust = -0.2, position = position_nudge(y = 0.5))

11 Summary figure

loyn.list <- with(loyn, list(AREA = modelr::seq_range(AREA, n=100)))
 
newdata <- emmeans(loyn.brm3, ~AREA|fGRAZE, at = loyn.list, type='response') |>
    as.data.frame()
head(newdata)
     AREA fGRAZE response lower.HPD upper.HPD
  0.10000 1      13.35678  8.347263  21.02803
 17.98788 1      23.72537 18.959959  28.43506
 35.87576 1      25.60432 21.472996  30.50046
 53.76364 1      26.77315 22.353425  31.31426
 71.65152 1      27.61484 23.291129  32.27396
 89.53939 1      28.27732 23.650035  32.87856

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
ggplot(newdata, aes(y=response, x=AREA)) +
  geom_ribbon(aes(ymin=lower.HPD, ymax=upper.HPD, fill=fGRAZE), alpha=0.3) +
  geom_line(aes(color=fGRAZE)) +
  theme_bw() +
  scale_x_log10() + 
  scale_y_log10()

spaghetti <- emmeans(loyn.rstanarm3, ~AREA|fGRAZE, at = loyn.list, type = 'response') |>
  gather_emmeans_draws() |>
  mutate(Fit = exp(.value))
Warning: Model has 0 prior weights, but we recovered 56 rows of data.
So prior weights were ignored.
wch <- sample(1:max(spaghetti$.draw), 100, replace = FALSE)
spaghetti <- spaghetti |>
  filter(.draw %in% wch)
ggplot(newdata) +
  geom_line(data=spaghetti, aes(y=Fit, x=AREA, color=fGRAZE,
                                group=interaction(fGRAZE,.draw)), alpha=0.05) +
  geom_line(aes(y=response, x=AREA, color=fGRAZE)) +
  theme_bw() +
  scale_x_log10() + scale_y_log10()

## or honouring the data range
loyn.nd <- loyn |>
    group_by(fGRAZE) |>
    expand(AREA = modelr::seq_range(AREA, n=100),
           DIST = mean(DIST), LDIST = mean(LDIST), ALT = mean(ALT), YR.ISOL = mean(YR.ISOL))
loyn.rstanarm3 |>
    epred_draws(newdata = loyn.nd, value = '.value') |>
    median_hdci() |>
    ggplot(aes(x = AREA, y = .value, colour = fGRAZE, fill = fGRAZE)) +
    geom_ribbon(aes(ymin = .lower, ymax = .upper), colour = NA, alpha = 0.3) +
    geom_line() +
    geom_point(data = loyn, aes(y = ABUND, x = AREA)) +
    scale_y_log10() +
    scale_x_log10() +
    theme_bw()

loyn.list <- with(loyn, list(AREA = seq(min(AREA), max(AREA), len = 100)))

newdata <- emmeans(loyn.rstanarm4b, ~AREA|fGRAZE, at = loyn.list, type = 'response') |>
    as.data.frame()
Warning: Model has 0 prior weights, but we recovered 56 rows of data.
So prior weights were ignored.
head(newdata)
     AREA fGRAZE response lower.HPD upper.HPD
  0.10000 1      19.33407  11.90402  27.95127
 17.98788 1      26.26695  22.54694  30.20361
 35.87576 1      27.30720  23.76920  30.55262
 53.76364 1      27.95785  24.77001  31.21964
 71.65152 1      28.41624  25.34805  31.79729
 89.53939 1      28.83454  25.62688  32.11486

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
ggplot(newdata, aes(y=response, x=AREA)) +
  geom_ribbon(aes(ymin=lower.HPD, ymax=upper.HPD, fill=fGRAZE), alpha=0.3) +
  geom_line(aes(color=fGRAZE)) +
  theme_bw() +
  scale_x_log10() + 
  scale_y_log10()

spaghetti <- emmeans(loyn.rstanarm4b, ~AREA|fGRAZE, at = loyn.list, type = 'response') |>
  gather_emmeans_draws() |>
  mutate(Fit = exp(.value))
Warning: Model has 0 prior weights, but we recovered 56 rows of data.
So prior weights were ignored.
wch <- sample(1:max(spaghetti$.draw), 100, replace = FALSE)
spaghetti <- spaghetti |> filter(.draw %in% wch)
ggplot(newdata) +
  geom_line(data=spaghetti, aes(y=Fit, x=AREA, color=fGRAZE,
                                group=interaction(fGRAZE,.draw)), alpha=0.05) +
  geom_line(aes(y=response, x=AREA, color=fGRAZE)) +
  theme_bw() +
  scale_x_log10() + scale_y_log10()

## or honouring the data range
loyn.nd <- loyn |>
    group_by(fGRAZE) |>
    expand(AREA = modelr::seq_range(AREA, n=100))
loyn.rstanarm4b |>
    epred_draws(newdata = loyn.nd, value = '.value') |>
    median_hdci() |>
    ggplot(aes(x = AREA, y = .value, colour = fGRAZE, fill = fGRAZE)) +
    geom_ribbon(aes(ymin = .lower, ymax = .upper), colour = NA, alpha = 0.3) +
    geom_line() +
    geom_point(data = loyn, aes(y = ABUND, x = AREA)) +
    scale_y_log10() +
    scale_x_log10() +
    theme_bw()

loyn.list <- with(loyn, list(AREA = modelr::seq_range(AREA, n=100)))

newdata <- emmeans(loyn.brm3, ~AREA|fGRAZE, at = loyn.list, type='response') |>
    as.data.frame() 
head(newdata)
     AREA fGRAZE response lower.HPD upper.HPD
  0.10000 1      13.35678  8.347263  21.02803
 17.98788 1      23.72537 18.959959  28.43506
 35.87576 1      25.60432 21.472996  30.50046
 53.76364 1      26.77315 22.353425  31.31426
 71.65152 1      27.61484 23.291129  32.27396
 89.53939 1      28.27732 23.650035  32.87856

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
ggplot(newdata, aes(y = response, x = AREA)) +
  geom_ribbon(aes(ymin = lower.HPD, ymax = upper.HPD, fill = fGRAZE), alpha = 0.3) +
  geom_line(aes(color = fGRAZE)) +
  theme_bw() +
  scale_x_log10() + 
  scale_y_log10()

spaghetti <- emmeans(loyn.brm3, ~AREA|fGRAZE, at = loyn.list, type = 'response') |>
  gather_emmeans_draws() |>
  mutate(Fit = exp(.value))
wch <- sample(1:max(spaghetti$.draw), 100, replace = FALSE)
spaghetti <- spaghetti |> filter(.draw %in% wch)
ggplot(newdata) +
  geom_line(data = spaghetti, aes(y = Fit, x = AREA, color = fGRAZE,
                                group = interaction(fGRAZE,.draw)), alpha = 0.1) +
  geom_line(aes(y = response, x = AREA, color = fGRAZE)) +
  theme_bw() +
  scale_x_log10() + scale_y_log10()

## or honouring the data range
loyn.nd <- loyn |>
    group_by(fGRAZE) |>
    expand(AREA = modelr::seq_range(AREA, n=100),
           DIST = mean(DIST), LDIST = mean(LDIST), ALT = mean(ALT), YR.ISOL = mean(YR.ISOL))
loyn.brm3 |>
    epred_draws(newdata = loyn.nd, value = '.value') |>
    median_hdci() |>
    ggplot(aes(x = AREA, y = .value, colour = fGRAZE, fill = fGRAZE)) +
    geom_ribbon(aes(ymin = .lower, ymax = .upper), colour = NA, alpha = 0.3) +
    geom_line() +
    geom_point(data = loyn, aes(y = ABUND, x = AREA)) +
    scale_y_log10() +
    scale_x_log10() +
    theme_bw()

Let's explore the relationship between bird abundance and patch area for each grazing intensity separately.

loyn.list <- with(loyn, list(AREA = modelr::seq_range(AREA, n = 100)))

newdata <- emmeans(loyn.brm4b, ~AREA|fGRAZE, at = loyn.list, type='response') |>
    as.data.frame()
head(newdata)
     AREA fGRAZE response lower.HPD upper.HPD
  0.10000 1      19.47290  12.51028  28.00172
 17.98788 1      26.29237  22.78493  30.10815
 35.87576 1      27.39322  24.28281  30.72652
 53.76364 1      28.04763  24.98364  31.22063
 71.65152 1      28.51268  25.33922  31.45183
 89.53939 1      28.88930  25.63054  31.82012

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
ggplot(newdata, aes(y = response, x = AREA)) +
  geom_ribbon(aes(ymin = lower.HPD, ymax = upper.HPD, fill = fGRAZE), alpha = 0.3) +
  geom_line(aes(color = fGRAZE)) +
  theme_bw() +
  scale_x_log10() + 
  scale_y_log10()

spaghetti <- emmeans(loyn.brm4b, ~AREA|fGRAZE, at = loyn.list, type = 'response') |>
  gather_emmeans_draws() |>
  mutate(Fit = exp(.value))
wch <- sample(1:max(spaghetti$.draw), 100, replace = FALSE)
spaghetti <- spaghetti |> filter(.draw %in% wch)
ggplot(newdata) +
  geom_line(data = spaghetti, aes(y = Fit, x = AREA, color = fGRAZE,
                                group = interaction(fGRAZE,.draw)), alpha = 0.1) +
  geom_line(aes(y = response, x = AREA, color = fGRAZE)) +
  theme_bw() +
  scale_x_log10() + scale_y_log10()

## or honouring the data range
loyn.nd <- loyn |>
    group_by(fGRAZE) |>
    expand(AREA = modelr::seq_range(AREA, n=100))
loyn.brm4b |>
    epred_draws(newdata = loyn.nd, value = '.value') |>
    median_hdci() |>
    ggplot(aes(x = AREA, y = .value, colour = fGRAZE, fill = fGRAZE)) +
    geom_ribbon(aes(ymin = .lower, ymax = .upper), colour = NA, alpha = 0.3) +
    geom_line() +
    geom_point(data = loyn, aes(y = ABUND, x = AREA)) +
    scale_y_log10() +
    scale_x_log10() +
    theme_bw()

12 References

Loyn, R. H. 1987. “Nature Conservation: The Role of Remnants of Native Vegetation.” In, edited by D. A. Saunders, G. W. Arnold, A. A. Burbridge, and A. J. M. Hopkins. Chipping Norton, NSW: Surrey Beatty & Sons.